## Multidimensional Scaling

Given a set of distances (dis-similarities) between objects, is it possible to recreate a dimensional representation of those objects?

Model: The distance between objects x and y is the square root of the sum of squared differences on k dimensions: $d_{xy} = \sqrt{\sum_{i=1}^{k} (x_i - y_i)^2}$
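As a quick check of the model, the distance can be computed by hand and compared with R's built-in dist() function. The two points below are made-up values chosen only for illustration.

```r
# Two hypothetical objects measured on k = 3 dimensions (illustrative values)
x <- c(1, 2, 3)
y <- c(4, 6, 3)

# Distance model: square root of the sum of squared differences
d.xy <- sqrt(sum((x - y)^2))
d.xy                                         # 5

# dist() uses Euclidean distance by default and gives the same value
d.builtin <- as.numeric(dist(rbind(x, y)))
d.builtin
```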

Data: a matrix of distances

Find the coordinates of the objects in k = 1, 2, ... dimensions that best reproduce the original distances.
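One common way to find such coordinates is classical (Torgerson) scaling, which is the method cmdscale implements. The following is only a minimal sketch of the idea, not the cmdscale source: the helper name classical.mds and the four example points are assumptions made for illustration. The squared distances are double-centered and the leading eigenvectors, scaled by the square roots of their eigenvalues, give the coordinates.

```r
# Sketch of classical scaling; 'classical.mds' is a hypothetical helper name
classical.mds <- function(d, k = 2) {
  D2 <- as.matrix(d)^2                      # squared distances
  n  <- nrow(D2)
  J  <- diag(n) - matrix(1 / n, n, n)       # centering matrix
  B  <- -0.5 * J %*% D2 %*% J               # double-centered inner products
  e  <- eigen(B, symmetric = TRUE)          # eigenvalues in decreasing order
  # coordinates: leading eigenvectors scaled by sqrt of their eigenvalues
  e$vectors[, 1:k, drop = FALSE] %*% diag(sqrt(e$values[1:k]), k)
}

# Four made-up points in the plane, so a 2-dimensional solution is exact
pts <- rbind(c(0, 0), c(3, 0), c(3, 4), c(0, 4))
d   <- dist(pts)

fit <- classical.mds(d, k = 2)
max(abs(dist(fit) - d))                     # essentially zero
max(abs(dist(cmdscale(d, k = 2)) - d))      # cmdscale reproduces the distances too
```

The recovered coordinates need not equal the original points; they match only up to rotation, reflection, and translation, which is why the comparison is made on the distances.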

Example: Consider the distances between nine American cities. Can we represent these cities in a two-dimensional space?

BOS     CHI     DC      DEN     LA      MIA     NY      SEA     SF
BOS     0       963     429     1949    2979    1504    206     2976    3095
CHI     963     0       671     996     2054    1329    802     2013    2142
DC      429     671     0       1616    2631    1075    233     2684    2799
DEN     1949    996     1616    0       1059    2037    1771    1307    1235
LA      2979    2054    2631    1059    0       2687    2786    1131    379
MIA     1504    1329    1075    2037    2687    0       1308    3273    3053
NY      206     802     233     1771    2786    1308    0       2815    2934
SEA     2976    2013    2684    1307    1131    3273    2815    0       808
SF      3095    2142    2799    1235    379     3053    2934    808     0

This can be done in R by using the cmdscale function. First copy the distances from above to the clipboard. Then use the following commands:

source("http://personality-project.org/r/useful.r")     #get some extra functions, including read.clipboard()
cities <- read.clipboard()     #read the copied distance matrix from the clipboard
cities   #show the data
city.location <- cmdscale(cities, k=2)    #ask for a 2 dimensional solution
round(city.location,0)        #print the locations to the screen
plot(city.location,type="n", xlab="Dimension 1", ylab="Dimension 2",main ="cmdscale(cities)")    #put up a graphics window
text(city.location,labels=names(cities))     #put the cities into the map
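If copying through the clipboard is inconvenient, the same distances can be typed directly into a matrix. This is a sketch of an alternative way to enter the data, using the values from the table above; the object name city.names is introduced here only for illustration.

```r
# City labels and the distances from the table above, entered row by row
city.names <- c("BOS", "CHI", "DC", "DEN", "LA", "MIA", "NY", "SEA", "SF")
cities <- matrix(c(
     0,  963,  429, 1949, 2979, 1504,  206, 2976, 3095,
   963,    0,  671,  996, 2054, 1329,  802, 2013, 2142,
   429,  671,    0, 1616, 2631, 1075,  233, 2684, 2799,
  1949,  996, 1616,    0, 1059, 2037, 1771, 1307, 1235,
  2979, 2054, 2631, 1059,    0, 2687, 2786, 1131,  379,
  1504, 1329, 1075, 2037, 2687,    0, 1308, 3273, 3053,
   206,  802,  233, 1771, 2786, 1308,    0, 2815, 2934,
  2976, 2013, 2684, 1307, 1131, 3273, 2815,    0,  808,
  3095, 2142, 2799, 1235,  379, 3053, 2934,  808,    0),
  nrow = 9, byrow = TRUE, dimnames = list(city.names, city.names))

stopifnot(isSymmetric(cities))              # guard against typing errors
city.location <- cmdscale(cities, k = 2)    # ask for a 2 dimensional solution
round(city.location, 0)                     # print the locations to the screen
```

Note that names() returns NULL for a matrix, so when the data are entered this way the plot labels should come from rownames(city.location) rather than names(cities).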

The output gives us the original distance matrix (just to make sure we put it in correctly), the x,y coordinates for each city, and then the following graph.

> cities   #show the data
BOS  CHI   DC  DEN   LA  MIA   NY  SEA   SF
BOS    0  963  429 1949 2979 1504  206 2976 3095
CHI  963    0  671  996 2054 1329  802 2013 2142
DC   429  671    0 1616 2631 1075  233 2684 2799
DEN 1949  996 1616    0 1059 2037 1771 1307 1235
LA  2979 2054 2631 1059    0 2687 2786 1131  379
MIA 1504 1329 1075 2037 2687    0 1308 3273 3053
NY   206  802  233 1771 2786 1308    0 2815 2934
SEA 2976 2013 2684 1307 1131 3273 2815    0  808
SF  3095 2142 2799 1235  379 3053 2934  808    0
> city.location <- cmdscale(cities, k=2)    #ask for a 2 dimensional solution
> round(city.location,0)        #print the locations to the screen
[,1] [,2]
BOS -1349 -462
CHI  -428 -175
DC  -1077 -136
DEN   522   13
LA   1464  561
MIA -1227 1014
NY  -1199 -307
SEA  1596 -639
SF   1697  132

This solution can be represented graphically:

Note that the solution is not quite what we expected: it gives us a mirrored, Australian orientation of American cities. Because the distances between points are unchanged by reflection, the signs of the dimensions are arbitrary, and reversing the signs in city.location gives the more conventional representation:

city.location <- -city.location
plot(city.location,type="n", xlab="Dimension 1", ylab="Dimension 2",main ="cmdscale(cities)")    #put up a graphics window
text(city.location,labels=names(cities))     #put the cities into the map
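The reason the sign reversal is harmless can be checked numerically: reflecting a configuration leaves every interpoint distance unchanged. The sketch below uses four made-up points rather than the city data so that it runs on its own.

```r
# Four hypothetical points (illustrative values only)
pts <- rbind(A = c(0, 0), B = c(3, 0), C = c(3, 4), D = c(1, 5))
loc <- cmdscale(dist(pts), k = 2)

flipped <- -loc                       # reverse the signs on both dimensions
flip.y  <- loc %*% diag(c(1, -1))     # reverse the second dimension only

# Largest change in any interpoint distance after each reflection
diff.both <- max(abs(as.vector(dist(loc)) - as.vector(dist(flipped))))
diff.y    <- max(abs(as.vector(dist(loc)) - as.vector(dist(flip.y))))
c(diff.both, diff.y)                  # both are zero up to rounding error
```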

Using the maps package, we can compare this solution to a map of the US:

library(maps)     #load the maps package (it must be installed first)
map("state")

A useful feature of R is that most commands have an extensive help file. Asking for help(cmdscale) shows that R includes a distance matrix for 21 European cities. The following commands (taken from the help file) produce a nice two-dimensional solution. (Note that since the orientation of the dimensions is arbitrary, the second dimension needs to be flipped to produce the conventional map of Europe.)

loc <- cmdscale(eurodist)
x <- loc[,1]
y <- -loc[,2]
plot(x, y, type="n", xlab="", ylab="", main="cmdscale(eurodist)")
text(x, y, rownames(loc), cex=0.8)     #city labels are carried as the row names of the solution
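To judge how well two dimensions reproduce the original data, cmdscale can be asked to return its eigenvalues and goodness-of-fit values by setting eig=TRUE. The sketch below is one possible check, not part of the help file example; the correlation between the original and reproduced distances is added here as a simple informal summary.

```r
fit <- cmdscale(eurodist, k = 2, eig = TRUE)   # returns a list instead of a matrix
fit$GOF                                        # two goodness-of-fit values; closer to 1 is better

# Compare the original road distances with those implied by the 2-d solution
reproduced <- dist(fit$points)
fit.cor <- cor(as.vector(eurodist), as.vector(reproduced))
fit.cor
```

Because eurodist holds road distances rather than straight-line distances, a perfect fit in two dimensions should not be expected.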