## Description

1) Simulation:

a) Write a function to simulate N uniformly-drawn points within bound polyhedra. The function

SimPolyHedra would proceed by a 1st stage of uniformly drawing (possibly >N) points within

a box that is bounding the desired polyhedra, followed by a 2nd stage of filtering, so that the

function outputs only the first N of those points that fall within the polyhedra. The 2nd stage

would retain a point iff it is in any of p1 convex polyhedra that are provided as input. The input

arguments are:

N: The number of points to simulate

Bounds: A real D×2 matrix, each of its rows specifying the (lower,upper) bounds of a Ddimensional box from which all points are drawn at the 1st stage

Polyhedra: A cell-array of real matrices M

1

,…,Mp

. M

i

is of size fi×(D+1). It defines a convex

polyhedron of fi faces, as all the vectors x in R

D

such that

[

] ⃗ . Each face

is thus defined by a row of M

i

interpreted as a hyperplane in R

D

. Note that M

i

may be unbounded, by devfined faces, as we only consider its intersection with

the Bounds box.

Output:

X: A real N×D matrix, each of its rows specifying a vector that is inside the

Bounds box and inside at least one of the polyhedra M

i

. To generate those you

draw potential D dimensional vectors x whose transpose could serve as rows of X

. You would then check whether to include each row, i.e. whether any for each

such x there exists at least one M

i

such that the

[

] ⃗ condition is satisfied.

[20 points]

b) Use the above to write a function SimTanzania() that simulate points of particular colors in

the Tanzanian flag (see attached Tanzania.pdf). This flag spans the axis-bounded rectangle

between (0,0) and (1.5,1.0), and has 5 regions in green, yellow, black, yellow and cyan, separated

by the lines

Draw points inside the

planar rectangle of the flag, and save 4 text files, each of N=50 rows and D=2 columns of

numbers in text, specifying 50 points of the appropriate color: Tanzania_green.txt,

Tanzania_cyan.txt,Tanzania_black.txt and Tanzania_yellow.txt .

[5 points]

2) Probabilistic interpretation:

a) Consider SimPoly of Assignment2, Question 1 as defining a probability space of potential

outputs y. As such, it defines a probability density function f (y) over possible values of y= y .

Of course, this probability space is different for each input, so f (y) depends on the inputs

RealThetas, sigma and x. Denote it fθ,,x (y) for RealThetas=θ, sigma= and

x=x.Prove that the least-squares regression resultθ=θ* maximizes fθ,,x (y) for any ,x and y.

[15 points]

b) Consider SimLogistic of Assignment2, Question 3 , with zero noise, as defining a probability

space of potential outputs y. As such, it defines a probability function P(y) over possible values

of y= y . Of course, this probability space is different for each input, so P (y) depends on the

inputs RealThetas and x. Denote it Pθ, x (y) for RealThetas=θ and x=x. Prove that

the logistic regression resultθ=θ* maximizes Pθ,x (y) for any x and y.

[15 points]

Guidance: Neither 2a nor 2b requires computing derivatives. Both can be solved by the definition of

θ

*

as an ERM, so you are welcome to just use that.

3) Optional:

Prepare a single sided, single page, 12-font English-letter (with potential notations in Greek) cheat

sheet for the quiz scheduled for Feb 19th. This would be the only allowed material.

[0 points]

Good luck!