
onNYTurf

It Is High Time The MTA Offer Its Schedule Data As A Public Feed

by Will
Tuesday Aug 28, 2007
Posted to Front Page Posts
The other day there was news that the MTA and Google would be collaborating on a new transit service site of some sort. This is a great opportunity for the MTA to make its schedule data publicly available in a way that is easy for software developers to use. The question is, what path will the MTA take? Until now they have been a frustrating and difficult agency for developers to work with, and a new deal with Google raises the specter of unfair play. onNYTurf has been trying for two months to get from the MTA the same schedule data Google will need, and for the most part we have been met with stalling and excuses (more on that below).

Why This Is A Great Opportunity

A likely first step is that MTA schedule data will be incorporated into Google's Transit Planning Project. This raises the question: will Google be the only party getting the data, or map artwork, or anything else from the MTA? Anything the MTA gives Google should also be available to the public.

The MTA schedule data should be made available to the public at large in an easy-to-use structured data format, so that anyone interested in developing a web service based on it can do so. To work with Google's Transit directions system, the MTA will have to create just such a data feed. When they do, they should make access to it completely public. After all, we all pay for it with taxes and fares.

There are lots of things you can do with schedule data, not just build a transit planner. For one, you can analyze the reach of a transit system. You can make maps that show what area can reach a target destination in, say, 30 minutes or 1 hour. This can be vital to city planning, or to rationalizing a rezoning. It is also vital to modeling expansions or changes to transit services. With the data in the public's hands, anyone can explore where the best places to build a new downtown might be, or how effective a new rapid bus service might be at relieving congestion on existing subway services. Until now we have had to rely on agencies like the DOT to not lie to us, or on NFPs to raise tons of money to conduct studies. It doesn't have to be this way! One intent of onNYTurf's use of NYC transit data is to create such a service for modeling changes to the transit system. But we can't do it if the MTA plays favorites and does not make its data public and easy to use! This is why onNYTurf's lawyer filed a Freedom Of Information Law request for a database dump of all MTA schedule data this past June!
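The reachability idea above can be sketched in a few lines. This is only a toy sketch, not onNYTurf's software: the stop names and travel times are made up, and it ignores timetables, waiting, and transfers, which a real analysis built on schedule data would have to account for. It runs Dijkstra's shortest-path search outward from one stop and keeps every stop whose travel time fits the time budget.

```python
import heapq


def reachable_within(graph, origin, budget_min):
    """Return {stop: minutes} for every stop reachable from origin within budget_min.

    graph maps a stop name to a list of (neighbor, travel_minutes) pairs.
    """
    best = {origin: 0}
    queue = [(0, origin)]
    while queue:
        elapsed, stop = heapq.heappop(queue)
        # Skip stale queue entries that were superseded by a faster route.
        if elapsed > best.get(stop, float("inf")):
            continue
        for neighbor, minutes in graph.get(stop, []):
            total = elapsed + minutes
            if total <= budget_min and total < best.get(neighbor, float("inf")):
                best[neighbor] = total
                heapq.heappush(queue, (total, neighbor))
    return best


# Hypothetical four-stop network; travel times are invented for illustration.
TOY_NETWORK = {
    "A": [("B", 10), ("C", 25)],
    "B": [("C", 8), ("D", 15)],
    "C": [("D", 5)],
    "D": [],
}

if __name__ == "__main__":
    for stop, minutes in sorted(reachable_within(TOY_NETWORK, "A", 30).items()):
        print(stop, minutes)
```

To ask the question the other way around (which stops can reach a target destination within the budget), the same search would be run over the reversed edges. Plotting the resulting stops on a map gives the kind of 30 minute or 1 hour reach map described above.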

MTA Stalls In Response To onNYTurf FOIL Request for Data

Currently schedule data is only available to the public in PDF form. PDFs are a nightmare to cull data from, which is why we want structured data. In addition to FOILing the MTA, we also FOILed each agency of the MTA, such as NYC Transit, Metro North, etc., independently. So far we have been met with months of delays! A couple of agencies have responded: L.I.Bus sent the information, and reportedly a CD from MetroNorth is in the mail. Meanwhile we also received a letter from the MTA saying they do not maintain such data, thank you, have a nice day. Which was quite silly, in that they did not even suggest that we file the request with each agency instead. Likewise NYC Transit has delayed for more than two months, and the latest is that we will have a response to our request from them in October - they are "researching" it. This is completely ridiculous!

How Transit Planners Work And How Data Is Maintained

For a transit planner to work, all the schedule data is generally stored in what is basically one database. Exporting a database requires all the technical overhead of executing one command on the computer! Something like:

    pg_dump databasename > outfile

Ladies and gentlemen, this is not rocket science, it is only computer science. It is very likely that the MTA's computer people could export ALL MTA schedule data at once into one file. The process should take 30 minutes at most. Maybe an hour to also burn a CD. If the MTA can keep two sets of books (fraudulently, remember!) they certainly can make a copy of their schedule data in a timely way.
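To show how little is involved in a dump, here is a small sketch. It uses SQLite from Python's standard library rather than PostgreSQL, purely so it runs anywhere without a database server. We do not know what database the MTA actually uses, and the table, routes, and times below are invented for illustration.

```python
import sqlite3


def dump_database(connection):
    """Return the whole database as one SQL text dump."""
    return "\n".join(connection.iterdump())


# Hypothetical schedule table with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE departures (route TEXT, stop TEXT, departs TEXT)")
conn.executemany(
    "INSERT INTO departures VALUES (?, ?, ?)",
    [("X", "Main St", "08:00"), ("X", "Main St", "08:10")],
)
conn.commit()

dump_text = dump_database(conn)

# Anyone who receives the dump can rebuild the same database from it.
restored = sqlite3.connect(":memory:")
restored.executescript(dump_text)
restored_rows = restored.execute("SELECT COUNT(*) FROM departures").fetchone()[0]

if __name__ == "__main__":
    print(dump_text)
    print("rows restored:", restored_rows)
```

The dump is plain SQL text, so it fits in one file and can be loaded by anyone who receives it.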

Why should they be able to dump all their data from one database if it seems to be managed by each agency? Because, from what I have learned from an inside source, it is my understanding that Trips123.com is contracted with the MTA to maintain all this data and provides the MTA with its current trip planner service. Trips123 also offers a trip planner on its own site, one of the apparent perks of its contract. So all the schedule information is in one place. Unfortunately Trips123 has a lock on this information. So to be fair, Google will not be the first to have exclusive access to structured data from the MTA, maybe just the second. But where does that leave the rest of us who could put such data to a variety of good uses?

Will onNYTurf Have Its FOIL Answered Before Google Gets The Same Data?

We doubt it. We suspect Google may actually already have the data - but that is just speculation.

The Only Reasonable Solution: Public Data Feed

Other cities make their transit data available as a public data feed, so why can't the MTA?! With a feed they would maintain the data (which they already do), update it regularly, and publish it on the web for anyone to access. This is already done in Portland and in San Francisco. This makes so much more sense. One, it's not some exclusive access bullshit deal that gives a private company what is actually publicly paid for information. Two, it means the data is up to date and uniform across all uses. This would improve the overall reliability of all NYC transit schedule services. Right now it is anyone's guess how up to date HopStop, Brail, etc. are, or how up to date onNYTurf's soon-to-come subway directions service will be.
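To give a sense of how easy a structured feed is to work with compared to PDFs, here is a small sketch. It assumes the feed follows the GTFS layout that Google's transit service consumes, where a `stop_times.txt` file lists one row per stop on each trip. The column names come from that format. The trip IDs, stop IDs, and times are invented sample data, not real MTA data.

```python
import csv
import io

# Invented sample in the shape of a GTFS stop_times.txt file.
SAMPLE_STOP_TIMES = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
T1,08:00:00,08:00:30,S1,1
T1,08:05:00,08:05:30,S2,2
T2,08:10:00,08:10:30,S1,1
T2,08:15:00,08:15:30,S2,2
"""


def departures_for_stop(feed_text, stop_id):
    """Return the sorted departure times at one stop from stop_times-style CSV text."""
    reader = csv.DictReader(io.StringIO(feed_text))
    times = [row["departure_time"] for row in reader if row["stop_id"] == stop_id]
    # Zero-padded HH:MM:SS strings sort correctly as plain text.
    return sorted(times)


if __name__ == "__main__":
    print(departures_for_stop(SAMPLE_STOP_TIMES, "S1"))
```

A few lines like these replace hours of copying times out of a PDF by hand, and every service reading the same feed sees the same schedule.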

We will be making another FOIL request to the MTA in the coming days to attempt to secure any transit feed data or other information they prepare for Google. If we get it we will certainly make it publicly available.

Update: and btw we are ready to go with putting out a very robust trip planner - we have the software to do it, including walking directions to stations. We just need the data!



Comments


HELL YES. It's Our Data and We Want It

by , Tuesday Aug 28, 2007 [08:57 PM]
I agree entirely. Our government gives away valuable assets to corporate exploitation at every level: digital broadcast spectrum, mining rights on Federal land, and this data is another example. I don't want to hear another word about Google as heroes of Net Neutrality and fair competition if they're going to pursue a business plan based on monetizing exclusive transit data created by a public utility. -M.O.

by Joe Hughes, Wednesday Aug 29, 2007 [05:11 AM]
Will, you're right, it's certainly to transit agencies' benefit to make their schedule available to the public in a machine-readable format like GTFS. (This is something I've talked about before on my blog.) Hopefully as time goes forward more people will build interesting and useful things with the feeds that are already available to help demonstrate to agencies how they and their riders can benefit from the energy and ideas that independent developers like you bring to the table.

Re:

by admin, Thursday Aug 30, 2007 [02:13 PM]
Thanks Joe. TriMet was awesome because I think they were the first to make this easy for developers to use. But heck, I would settle for a dump of the database that powers the transit planner on the MTA site. That would be a snap to work with!
Tags: Government, MTA
