Something is wrong with college.
I said the same thing about public transit sometime in 1994. I think transportation is fundamentally an information problem. I even made an itinerary planner to help solve it. In a way, my decade-long IT career was shaped by trying to solve that one problem using databases, XML, and the (mobile) web.
Then I realized that it's not just information, but the social conventions behind hopping on a bus instead of a car. You know, convenience, safety, privacy, stuff like that. That's the problem I've been hacking away at over these past few years. Do I have an answer to the social stuff? No. But I have a nice web program that plots itineraries. It kinda sorta works on my cell phone too.
What's wrong with college again? The bit about transferring credits seems like an offshoot of the database queries I wrote for hopping on a bus, making a connection somewhere, and arriving at your destination. Really. Only this time, you want to take courses at the most affordable place, and transfer them into the "best" place you want your terminal degree awarded from. Only, despite articulation agreements, these transfers don't always go down as planned.
These agreements aren't rigidly (enough) specified. There's confusion, misinformation, misunderstandings, etc., and college administrators prefer to err on the side of caution, while the students pay the bills for extra credits that may or may not be needed. I chatted with one of my students about this same issue after class, upon learning that he'd transferred from the same community college that just hired me for the Spring.
There are a bunch of issues here-- colleges have different strengths, missions. None can be the best at everything. So does it make sense for students to cross-register, like they do in some other college towns? (Amherst, MA is the leading example that comes to mind.)
But one big stumbling block is transportation-- how do they go from one campus to the next? Not so easy, and inter-campus transportation is almost nonexistent in Albany. Riding the bus from one to another can take more than an hour between service frequency and transfer waiting times.
In short, it may be worthwhile for a college to "outsource" courses, especially the intro ones, to someone able/willing to take it over at lower labor rates and better matched facilities. What if a community college is more adept at teaching "tool" courses? Should the larger college just "outsource" the class, but with hands-on involvement?
Or better yet, pool investments in developing online courses to be shared jointly. Courses don't have to be entirely online; purely online courses ignore the social aspects of learning. But they can be mediated by technology: schools share the costs of building the online portions (or even using the online content from the textbook), mix it up with locally-developed content to fit with their needs, and combine that with the classroom experience. If you can cut classroom time, or even move the classroom around from campus to campus, we've got the beginning of reengineering the process of college education.
After all, if you could harness the intellectual energy that students channel into Facebook and Myspace every day, they might have a fighting chance at learning something a little better. That would both raise the bar for education and make it a little more consistent-- based on the course rather than the reputation / stereotype of the granting institution.
And yeah, public transportation has to be radically reengineered too. I just haven't figured out that part yet. Nobody has, apparently. But I'm going to hack away at the problem just a little bit in my dissertation next year, which seems to be headed towards an information system for collaborative transportation systems between colleges. Or something like that.
Random notes about balancing work, school, family life, teaching, and research in transportation, social and mobile computing while finishing a PhD in Information Science.
Friday, November 30, 2007
Tuesday, November 27, 2007
The ugliest damn code I've written in a while...
This is so awful, it was worth a post. I'm using "legacy" XSL, a standard that was obsolete within a few months, back in 1999. But I can't find my "good" code, and I don't have time to redevelop it. So I'm stuck using what I put together almost 8 years ago. On top of that, there's the rather isolating feeling of not knowing a single person in Albany who knows anything about XSL (or XSLT for that matter).
I'd like to post it in the XSL Hall of Shame, but it won't show up without hacking all the brackets. (Blogger thinks it's ill-formed HTML. Close, but no cigar.)
But the GPS analysis / web-based questionnaire is just about done, though my brain is choking on some of the SQL queries-- the kind that separate the power coders from the newbies. I think I've gone from the former to the latter after 3 years away from cubicle life. I've had to resort to my Led Zeppelin albums to get back into gear. (Ah, I've written so much code to 70's and 90's rock. I can skip the 80's, please--once was quite enough.)
Sunday, November 18, 2007
XML Still Rocks
If there was ever a turning point in my development career this past decade, it came when I saw this article on loading GIS data from the Census Bureau (TIGER) onto SQL Server, rendering it with XML, and then vectorizing it with VML to draw maps. At that point, I realized that my passing hobby in building a public transit itinerary planner could actually be feasible. I mean, a big ol' list of bus station names is one thing, but the ability to grab them, locate them, and throw them on the map was absolutely da bomb.
Keep in mind, this was still February 2000, well before Google Maps, and Mapquest was still fairly new. I'd been eagerly waiting for each new edition of a few different online mags for programmers. I'm a largely self-trained developer, my experience largely limited to BASIC and C back in the late 1980's, before being abruptly yanked back into programming by the job market of the late 1990's. The Web at first seemed a joke, just a bunch of really slow pages loaded into a corner of America Online.
So, I dove into web dev starting in 1998, client/server, application servers, and drank the punch regarding Windows Distributed interNet Applications Architecture (Windows DNA). And I had an idea for making a simple tool to help users go from Point A to Point B using the bus and train, something I first cooked up in grad school around 1994, but had no real way to develop and deploy to end users.
Long story short, it's near the end of 2007, and I'm back in Albany, back in school, and digging up code I wrote in 2000 and 2001, so that I can build and deploy my simple web-based application for reporting GPS readings. I'm not doing live processing of uploaded data- there were too many constraints on my crappy free web development account, most of which could be resolved by me coughing up about $100 for a better account with cool stuff like FTP uploads, MySQL access, and customer support.
And I'm copying and debugging code that I wrote in a flurry of competence more than 7 years ago. And I'm slowly remembering the various gotchas with XML and XSL. Like I used to say, when it works, it works great... But it's sort of fun to revisit again, though XSLT is a little frustrating at times when you're out of practice.
Thursday, November 15, 2007
Daddy can't read....
Upon lots of prompting from my daughter, I finally found some of her baby pictures. Ideally, I'd have framed these and hung them on the walls. I have both bare walls and blank picture frames, kind of one of those 2 + 2 solutions. But the frames are for 3x5 pics, and our pics are 4x6, and I have no desire to cut them down, and no ambition to find the negatives and have the pictures reduced.
But it reminded me of the short interlude a few years ago, from when she started talking until her hard transition from only child to older sister occurred with a new baby in the house. I found that she could memorize a whole book, if I'd read it a few times, the usual Dr. Seuss fare. She'd even correct me if I misread a word, since I was often half asleep and sometimes paraphrasing from memory. I used to read whatever she picked, though it was often a Japanese kids picture book. So I had to make up a story to fit the pictures.
Until one day when my daughter was about 20 months old, she confided her secret discovery, one that caused her a lot of concern, while my wife was reading this same storybook. "Mommy," she interrupted, frowning, full of concern. "Daddy can't read." "Why?" my wife asked. "He doesn't know the real words!" my daughter replied...
But it brings me back to a question I was asked today: What do I read in my spare time? My answer: I don't think I have time to read, and I don't think I have spare time. Just barely time to go through the motions of reading...
But just as carrying around that GPS made me think about where I really go and how much time I spend in transit, I've realized that I do waste time, and I do read. Low quality time, low quality reading. As I write this, I recall something I used to say while still living in New Jersey- that you don't "find time" to do things any more than you "find money" to spend. You have to make it. I have to go back to making time to do worthwhile things again.
Then maybe I'll have time to read again. Something worthwhile, that's worth reading.
Sunday, November 11, 2007
For the last F-ing time, I'm not a GIS major
It's not that I have anything against GIS. Far from it. It's interesting. It's neat. You can draw lots of pretty, colorful maps that may or may not help you understand something better.
It's just that, just because I'm still doing public transit research, everyone in the program brands me as GIS. Then I have to explain again what it is I'm doing, then they understand.
Until the next time I'm spotted, and introduced as "the GIS major" in my cohort in the PhD program. It's gotten more popular. A lot of the new students I've met are at least considering GIS as a major. Naturally, they're recommended by some of our faculty to ask ME how I'm doing as a GIS major. Because I'm the famous GIS major in my year.
Except I'm not.
Sorry, I have to get back to my program / relational database analysis of GPS readings. It's all in SQL. There's no pictures, no colors, no choropleths. Just text, tables, and the occasional chart where I churn data and look for inflections to build into a story.
Wednesday, November 07, 2007
Just call me Professor ABD...
I got a teaching job. In January, I'll be a full-time Computer Science professor. Or when the paperwork goes through and makes it official. I'm psyched.
In the PhD program, if there's one thing they repeat ad nauseam, it's that you should never, ever take on a full time job before graduation. Then, I guess you get comfortable, lose momentum, and ultimately drop out, joining the hordes of ABD's who never pull it together and finish their dissertation. (God knows that PhD candidates shouldn't be allowed to relax.) It's a real issue, and my references, who are also on my dissertation committee, made that point pretty clear to my future Department Chair. But he hired me anyways, and is giving me a lighter teaching load my first year to help me finish-- I have just one new course to prepare, and the rest are courses I've already done as an adjunct.
Despite the number of colleges in and around Albany (the Capital District), the world of academia is even smaller- I think everyone knows everyone here. If that isn't a Social Network Analysis paper waiting to happen, I don't know what is.
I hope I can have a coffeemaker in my new office.
Sunday, November 04, 2007
Privet, kak dela?

I've been hammering away at the workflow for this GPS survey thing- where you upload your GPS logs and "it" finds your destinations and your travel routes, and then asks you for more information about it. Somehow, I think I've gone off the deep end with this thing, but after all, I did grab over three weeks of my life on GPS over the span of 2 1/2 months. The good news is that I have an even spread by day of week, though the sample is still skewed. (I only brought it with me when I was doing something I felt would be interesting for a GPS to log.)
So I finally settled on a quick reference chart, plotting the common log of the count of readings (10 sec intervals) vs km/h. So I get about 93,000 readings of not moving anywhere (less than 2 km/h, allowing for reading errors), or nearly a 5 on the log scale. I'm sure there's overlap between modes: slow cars vs. fast busses...
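For anyone who wants to try the same chart, here is a minimal sketch in Python of the binning behind it. The function name, the 2 km/h bin width, and the sample speeds are all made up for illustration; this is not the code I actually ran, just the idea of counting readings per speed bin and taking the common log of each count.

```python
import math
from collections import Counter

def log_speed_histogram(speeds_kmh, bin_width=2.0):
    """Return {bin_start_kmh: log10(count)} for readings grouped into speed bins.

    bin_width is an assumed value; pick whatever resolution suits the chart.
    """
    counts = Counter(int(s // bin_width) * bin_width for s in speeds_kmh)
    return {start: math.log10(n) for start, n in sorted(counts.items())}

# Hypothetical sample: mostly parked, some walking, a little driving.
sample = [0.0] * 1000 + [5.0] * 100 + [40.0] * 10
hist = log_speed_histogram(sample)
for start, log_count in hist.items():
    print(f"{start:5.1f} km/h  log10(count) = {log_count:.2f}")
```

The log scale is what keeps the huge pile of "not moving" readings from flattening everything else on the chart.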
Somehow, I'm thinking of the only phrase I remember in Russian, from when I studied it in college about 20 years ago: "Privet, Anna, kak dela?" It was textbook dialog from a "guard" to a student, taking place during the end of the Soviet era. "Hello, Anna, where are you going?" Though not quite idle chitchat, but part of the oppressive scrutiny Soviet citizens endured.
As useful as it seems, there is something still just a bit creepy about a little device that records your location every time the indicator light flashes. Even more so with the realization that, from the timekeeping aspect alone, you get atomic clock precision time for what will soon cost only a few bucks. With the added bonus that you also can get your location at regular intervals within a dozen feet or so, or maybe better with some good math and a predictive model. The kind of model that someone smarter than me can make compact and efficient without all the SQL mucking about.
Couple that with pervasive wireless networks, rapid increases in miniaturization and reductions in power use, and you start to wonder. Could you make an inert chip like RFID, where it is only powered from time to time by a passing RF pulse? Only this time, it powers up long enough to grab a satellite signal and find an open Wi-Fi port to beam data elsewhere. God knows you have to bathe cheap RFID chips in a lot of RF to get a reliable signal from every box on a pallet. Imagine that your "active RFID" modules are a bit more active than you might realize.
But there are a lot of other people working on the same thing as me, and there's a certain inevitability to what will come from all this. Someone is going to be first, and there will be a lot of things to consider when these models are finally discovered and the devices are put into use. One would hope that it'll all just get locked down by IP from a firm otherwise inept at deploying the technology effectively. (No names, though!)
Saturday, November 03, 2007
Killed by Math
Math doesn't kill. But SQL queries that use a lot of math do get killed.
I'm trying to mine 150,000 GPS readings for places where I'm stopped. What does that mean? Places where my calculated speed is zero or near zero for "a spell". Which means what, with the errors between readings?
Basically, I chopped off the last three decimals of the latitude and longitude, rounding each to three decimal places, which reduces the precision to +/- .0005 degree, or (I think) roughly +/- 180 feet of latitude. Then I grouped the records and got a count. But where the count is low, I want to eliminate those records.
How that panned out, after about an hour:
SELECT round(latitude,3) AS roundlat, round(longitude,3) AS roundlong, count(*) AS intcount
FROM CleanGPS
WHERE speed<2
GROUP BY round(latitude,3), round(longitude,3)
HAVING COUNT(*) > 10
ORDER BY count(*) desc;
But the downside is that I can get four different records in the result for what's probably the same point: latitude +/- .0005 degree, longitude +/- .0005.
Still, it reduced the original 150,000 records down to 500. It's a start. I'll probably have to requery the resultset, maybe chop it down to the nearest .01 degree and then backtrack to figure out where it actually was, from an average of the original data.
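Here's a rough sketch of what that requery might look like, run against an in-memory SQLite table from Python so it can be tried anywhere. The sample coordinates are invented, and SQLite's round() stands in for whatever the real database offers. The idea is to group on the coarser .01 degree grid while averaging the original, unrounded readings to recover the actual spot.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CleanGPS (latitude REAL, longitude REAL, speed REAL)")

# Hypothetical readings: one stop that straddles a .001 degree rounding
# boundary (so the 3-decimal query splits it in two), plus one moving reading.
rows = (
    [(42.6864, -73.8236, 0.5)] * 6
    + [(42.6866, -73.8234, 0.8)] * 6
    + [(42.7000, -73.8000, 35.0)]
)
conn.executemany("INSERT INTO CleanGPS VALUES (?, ?, ?)", rows)

# Group on the coarse grid, but average the original coordinates.
query = """
SELECT round(latitude, 2)  AS coarselat,
       round(longitude, 2) AS coarselong,
       avg(latitude)       AS centerlat,
       avg(longitude)      AS centerlong,
       count(*)            AS intcount
FROM CleanGPS
WHERE speed < 2
GROUP BY round(latitude, 2), round(longitude, 2)
HAVING count(*) > 10
ORDER BY count(*) DESC
"""
stops = conn.execute(query).fetchall()
for coarselat, coarselong, centerlat, centerlong, intcount in stops:
    print(f"{intcount} readings near ({centerlat:.4f}, {centerlong:.4f})")
```

In this toy data the 3-decimal grouping would give two groups of 6, both thrown out by the HAVING clause, while the coarser grid keeps them together as one stop of 12. A coarser grid still has edges of its own, so a stop can straddle those too; it just happens less often.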
Thursday, November 01, 2007
Fun with GPS and MS Access (and VFP and maybe MySQL)
I'm working on my very late project-- the one I wanted to finish a month ago.
It's about distilling some intelligence from raw data logs-- turning thousands of lat/long readings into some kind of narrative that doesn't involve eyeballing Google KML files and trying to interpret the nice green bands it draws over its satellite photos of Albany.
The goal is to identify "points of interest" and then use a survey generator of my own creation to ask the user what he/she did there. It's all about helping the user reconstruct travel itineraries and possibly get a better understanding of how the user gets from Point A to Point B to do Pointless Errand C.
It dawned on me that I'll have to grab some kind of external GIS data to interpret the data. It's one thing to calculate headings (i.e. 0 is North, 90 is East, 180 is South, 270 is West, and 360 becomes North at 0 again). Or speed (I'm assuming that the earth is flat here, since reconstructing topography from the altitude readings is more annoying than it's worth).
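To make the heading and speed math concrete, here is a minimal flat-earth sketch in Python. The function name and the Earth radius constant are my own assumptions for illustration; it ignores altitude entirely, treats two nearby readings as points on a plane, and assumes the time between readings is greater than zero.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant

def heading_and_speed(lat1, lon1, lat2, lon2, seconds):
    """Approximate heading (degrees, 0 = North, clockwise) and speed (km/h)
    between two nearby readings, treating the area as a flat plane."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    # Distance moved north and east, in km. East-west distance shrinks
    # with the cosine of the latitude.
    north_km = math.radians(lat2 - lat1) * EARTH_RADIUS_KM
    east_km = math.radians(lon2 - lon1) * EARTH_RADIUS_KM * math.cos(mean_lat)
    # atan2(east, north) measures the angle clockwise from North.
    heading = math.degrees(math.atan2(east_km, north_km)) % 360
    speed_kmh = math.hypot(north_km, east_km) / (seconds / 3600)
    return heading, speed_kmh

# Hypothetical pair of readings 10 seconds apart, moving due east.
heading, speed = heading_and_speed(42.0, -73.0, 42.0, -72.999, 10)
print(f"heading {heading:.0f} degrees, speed {speed:.1f} km/h")
```

When the two readings are identical the speed comes out as zero and the heading is meaningless, so the heading is only worth trusting once the speed clears the noise floor.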
So what better source of familiar free GIS data than Census TIGER data? I started working with it around New Year's Day 2000. In fact, it was the discovery of this data set that made me seriously pursue my whole transit itinerary planner project a few years ago, which ultimately led to me returning to Albany and getting into the PhD program.
And just by comparing the lat/long coordinates of a specific GPS reading, I can now run a query and return the street block where the reading was taken. Given that I'll have anywhere from a few to dozens of readings per block, and multiple blocks per street traveled on, this could mean a huge reduction in data.
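The block lookup can be sketched in a few lines of Python. To be clear, this is not my actual query and not the real TIGER layout: the segment dictionaries, their field names, the street names, and the 30 metre cutoff are all hypothetical. Each block is treated as a straight line between two endpoints, and the reading is matched to the closest one on a flat plane.

```python
import math

M_PER_DEG = 111_320.0  # rough metres per degree of latitude (an assumption)

def _to_xy(lat, lon, ref_lat):
    """Project lat/long to flat x/y metres, scaling longitude by cos(latitude)."""
    return (lon * M_PER_DEG * math.cos(math.radians(ref_lat)), lat * M_PER_DEG)

def distance_to_segment_m(lat, lon, seg):
    """Flat-plane distance in metres from a reading to a street segment."""
    px, py = _to_xy(lat, lon, lat)
    ax, ay = _to_xy(seg["from_lat"], seg["from_lon"], lat)
    bx, by = _to_xy(seg["to_lat"], seg["to_lon"], lat)
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        t = 0.0  # degenerate segment: both endpoints are the same point
    else:
        # Project the reading onto the segment and clamp to its endpoints.
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def nearest_block(lat, lon, segments, max_m=30.0):
    """Name of the closest block, or None if nothing is within max_m metres."""
    best = min(segments, key=lambda s: distance_to_segment_m(lat, lon, s), default=None)
    if best is None or distance_to_segment_m(lat, lon, best) > max_m:
        return None
    return best["name"]

# Hypothetical blocks, not real TIGER records.
blocks = [
    {"name": "Street A, block 1", "from_lat": 42.6600, "from_lon": -73.7700,
     "to_lat": 42.6600, "to_lon": -73.7680},
    {"name": "Street B, block 1", "from_lat": 42.6700, "from_lon": -73.7700,
     "to_lat": 42.6700, "to_lon": -73.7680},
]
print(nearest_block(42.66005, -73.7690, blocks))
```

Scanning every segment for every reading is fine for a sketch, but with a whole county of blocks a bounding-box filter in the database first would be the sensible move.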
Now to come up with rules. Like how long spent at an intersection indicates a likely red light. Or the ranges to indicate walking speed, bus speeds, and car speeds. And how to turn that into survey questions... But first things first.
Like sneaking candy from the kids' Halloween bags while they're asleep. Nothing like late night programming to send me on a chocolate binge...
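As for those rules, here is a first guess at what they might look like in Python. Only the 2 km/h "stopped" cutoff and the 10 second reading interval come from my own data; the 7 km/h walking cutoff and the 120 second red light limit are placeholder assumptions to be tuned, and buses and cars are lumped together as "vehicle" because their speeds overlap.

```python
from itertools import groupby

READING_INTERVAL_S = 10  # the logger records one reading every 10 seconds

def classify_speed(kmh):
    """Label one reading by speed. The 7 km/h cutoff is an assumption."""
    if kmh < 2:
        return "stopped"
    if kmh < 7:
        return "walking"
    return "vehicle"

def summarize_runs(speeds_kmh):
    """Collapse consecutive readings with the same label into (label, seconds) runs."""
    return [(label, len(list(group)) * READING_INTERVAL_S)
            for label, group in groupby(speeds_kmh, key=classify_speed)]

def likely_red_lights(runs, max_wait_s=120):
    """Indexes of short stops that fall between two vehicle runs."""
    return [i for i in range(1, len(runs) - 1)
            if runs[i][0] == "stopped" and runs[i][1] <= max_wait_s
            and runs[i - 1][0] == "vehicle" and runs[i + 1][0] == "vehicle"]

# Hypothetical trip: drive, wait 30 seconds, drive, then walk.
speeds = [40, 35, 0.5, 0.3, 0.4, 30, 45, 5, 4]
runs = summarize_runs(speeds)
print(runs)
print(likely_red_lights(runs))
```

Each run is a candidate survey question: a long "stopped" run becomes "what did you do here?", while a short stop between two vehicle runs can probably be skipped as a red light.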