Jun 6, 2008

Reconstructing the Dow

Recently I had to reconstruct the Dow Jones Industrial Average (DJIA) for backtesting purposes. This turned out to be more painful than anticipated. In case you need to do this, I recommend you start out with this document detailing the historical composition of the DJIA. From this, create a .txt file containing the date and type of each change over the relevant time interval. Write some code to read this into your preferred programming environment (MATLAB in my case) and create a data structure containing the composition of the Dow at any given point in time (daily closings, in my case). Then look up as many ticker symbols as possible at Yahoo Finance and the Dow's Wikipedia entry. For the rest, I googled, though there's probably some sort of central list of tickers maintained somewhere. Below I list what I could find for the years between 1990 and 2008. Note that many of these ticker symbols denote different companies today.
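The change file and the composition lookup described above could be sketched roughly like this. I use Python here instead of MATLAB. The file format (ISO date, ADD or REMOVE, company name) and the function names parse_changes and composition_on are my own assumptions, and the sample rows are only illustrative; take the real dates from the composition document.

```python
from datetime import date

# Hypothetical change-log format: ISO date, ADD/REMOVE, company name.
# The rows are illustrative only; check them against the composition document.
SAMPLE_LOG = """\
# date       action company
1991-05-06 REMOVE Navistar International Corp.
1991-05-06 ADD Caterpillar Incorporated
1997-03-17 REMOVE Woolworth
1997-03-17 ADD Wal-Mart Stores Incorporated
"""

def parse_changes(text):
    """Return a date-sorted list of (date, action, company) tuples."""
    changes = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        day, action, name = line.split(maxsplit=2)
        if action not in ("ADD", "REMOVE"):
            raise ValueError(f"unknown change type: {action!r}")
        changes.append((date.fromisoformat(day), action, name))
    # Stable sort: changes on the same day keep their order from the file.
    changes.sort(key=lambda change: change[0])
    return changes

def composition_on(initial, changes, when):
    """Replay all changes up to and including `when` on the initial member set."""
    members = set(initial)
    for day, action, name in changes:
        if day > when:
            break
        if action == "ADD":
            members.add(name)
        else:
            members.discard(name)
    return members

changes = parse_changes(SAMPLE_LOG)
start = {"Navistar International Corp.", "Woolworth"}  # truncated starting set
print(sorted(composition_on(start, changes, date(1995, 1, 3))))
```

Replaying the log from a known starting composition is slow if you call it for every trading day, but it is easy to check by hand. For daily closings you would probably walk the dates once and update the set as you go.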

3M Company
MMM
AT&T Corporation
T
AT&T Incorporated
T
Alcoa Incorporated
AA
Allied-Signal Incorporated
ALD (today ALD stands for Allied Capital Corporation; AlliedSignal merged with Honeywell)
AlliedSignal Incorporated
ALD (again, today Allied Capital Corporation)
Altria Group Incorporated
MO
Altria Group, Incorporated
MO
Aluminum Company of America
AA
American Express Company
AXP
American International Group Inc.
AIG
American Tel. & Tel.
T
Bank of America Corporation
BAC
Bethlehem Steel
BS (Delisted)
Boeing Company
BA
Caterpillar Incorporated
CAT
Chevron
CVX
Chevron Corporation
CVX
Citigroup Incorporated
C
Coca-Cola Company
KO
Du Pont
DD
DuPont
DD
Dupont
DD
Eastman Kodak Company
EK
Exxon Corporation
XOM
Exxon Mobil Company
XOM
Exxon Mobil Corporation
XOM
General Electric Company
GE
General Motors Corporation
GM
Goodyear
GT
Hewlett-Packard Company
HPQ
Home Depot Incorporated
HD
Honeywell International
HON
Honeywell International Inc.
HON
Intel Corporation
INTC
International Business Machines
IBM
International Paper Company
IP
J.P. Morgan & Company
JPM
J.P. Morgan Chase
JPM
J.P. Morgan Chase & Company
JPM
Johnson & Johnson
JNJ
McDonald’s Corporation
MCD
Merck & Company, Inc.
MRK
Merck & Company, Incorporated
MRK
Microsoft Corporation
MSFT
Minnesota Mining & Mfg
MMM
Navistar International Corp.
NAVZ.PK (Only on Pink Sheets, delisted from NYSE in 2006)
Pfizer Incorporated
PFE
Philip Morris Companies Inc.
MO (PM today stands for Philip Morris International, spun off from Altria in 2008)
Phizer Incorporated
PFE
Primerica Corporation
??? I have no idea.
Procter & Gamble Company
PG
SBC Communications Incorporated
SBC (delisted after the AT&T merger)
Sears Roebuck & Company
S (S now stands for Sprint)
Texaco Incorporated
TX (now stands for Ternium)
Travelers Group
TRV (now stands for Travelers Companies; unrelated company!)
USX Corporation
X
Union Carbide
UK (delisted)
United Technologies Corporation
UTX
Verizon Communications Inc.
VZ
Wal-Mart Stores Incorporated
WMT
Walt Disney Company
DIS
Westinghouse Electric
WX (now stands for WuXi PharmaTech)
Woolworth
WOW (probably)

Next, link the company names to their respective ticker symbols, and download stock quotes for all the ticker/date combinations. In MATLAB, this is most conveniently done using this routine by Marcelo Scherer Perlin, which accesses free Yahoo datasets. For the delisted stocks, or for intraday data, you'll have to resort to proprietary datasets. Opentick may be a good free alternative, but I haven't got around to looking at it more closely.
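Linking names to tickers and working out which date range to download per ticker might look something like the sketch below (again Python, not MATLAB). The TICKERS excerpt comes from the list above. The function name download_jobs and the membership intervals are hypothetical, and the dates are placeholders, not verified index dates.

```python
from datetime import date

# Excerpt of the name-to-ticker list; None marks a ticker I could not find.
TICKERS = {
    "Minnesota Mining & Mfg": "MMM",
    "3M Company": "MMM",
    "Caterpillar Incorporated": "CAT",
    "Primerica Corporation": None,
}

def download_jobs(intervals, tickers):
    """Collapse (company, first_day, last_day) intervals into one range per ticker.

    Returns ({ticker: (earliest_day, latest_day)}, [names without a ticker]).
    """
    jobs, missing = {}, []
    for name, first, last in intervals:
        ticker = tickers.get(name)
        if ticker is None:
            missing.append(name)  # needs another data source
            continue
        if ticker in jobs:
            lo, hi = jobs[ticker]
            jobs[ticker] = (min(lo, first), max(hi, last))
        else:
            jobs[ticker] = (first, last)
    return jobs, missing

# Placeholder intervals; the real ones follow from the composition data.
intervals = [
    ("Minnesota Mining & Mfg", date(1990, 1, 2), date(2002, 4, 7)),
    ("3M Company", date(2002, 4, 8), date(2008, 6, 6)),
    ("Caterpillar Incorporated", date(1991, 5, 6), date(2008, 6, 6)),
    ("Primerica Corporation", date(1990, 1, 2), date(1991, 5, 5)),
]
jobs, missing = download_jobs(intervals, TICKERS)
print(jobs)
print("no ticker for:", missing)
```

Merging by ticker means a company that was renamed (Minnesota Mining & Mfg to 3M) gets downloaded once. Be careful with symbols that changed hands, such as S or TX: for those, a quote series fetched today under the old symbol belongs to the wrong company.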

Finally, you have to reconstruct the index from the individual quotes. Here's an explanation of how the DJIA is calculated. You'll notice you need to know historical values for the so-called Dow divisor which, as far as I know, are impossible to obtain in electronic format with reasonable effort. Fortunately, you can back-compute them from any single known value by assuming that splits, dividends, and changes in the DJIA composition should not have an effect on the index value. This is admittedly somewhat pointless, as historical index data can be readily obtained, but it can serve as a sort of checksum for the individual quotes you have.
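To make the back-computation concrete: the index is the sum of the component prices divided by the divisor, so requiring the index to be unchanged across an event ties the old divisor to the new one. Here is a minimal Python sketch with made-up numbers; the function names are mine, and the three-stock "index" is only there to keep the arithmetic visible.

```python
import math

def index_value(prices, divisor):
    """Price-weighted index: sum of component prices over the divisor."""
    return sum(prices) / divisor

def divisor_before_event(divisor_after, prices_before, prices_after):
    """Step one event (split, component swap, ...) back in time.

    prices_before: prices of the old composition just before the event.
    prices_after:  prices of the new composition just after the event.
    Assumes the event itself leaves the index value unchanged.
    """
    return divisor_after * sum(prices_before) / sum(prices_after)

# Made-up numbers: three components, the third one splits 2-for-1.
prices_before = [10.0, 20.0, 60.0]
prices_after = [10.0, 20.0, 30.0]
divisor_after = 0.6

divisor_prev = divisor_before_event(divisor_after, prices_before, prices_after)

# Both sides of the event give the same index level (up to rounding).
assert math.isclose(
    index_value(prices_before, divisor_prev),
    index_value(prices_after, divisor_after),
)
print(divisor_prev)
```

Starting from one known divisor and applying this step at every event, newest first, gives the whole divisor history. Comparing the reconstructed index against the published closes then shows where a quote series is off.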

