Data Structures: a rant

(If you’re curious, the textbook from this class–and the source of the explanatory links I’m using in this post–is available online free here: http://interactivepython.org/runestone/static/pythonds/index.html)

– – –

I just got out of another Data Structures class. I zoned out halfway through, which is somewhat unusual for me. Happens more often in Data Structures, because the class starts at 8AM. (Fortunately, I think I found a less winding route to cut down on my time walking to class–hopefully the shortcut will make me less prone to lateness. I am not a morning person. I tend to dream about waking up and going to class. I’ve even slept through the 10AM labs, which is ridiculous and frustrating–but also kind of understandable, because they’re on the day after the 8AM class, and my brain is trying to get rid of its sleep debt.)

But that’s not really why I started tuning out the professor’s lecture today. It was the result of another impractical implementation on the projector, at which point a thought struck me:

If these data structures are so useful, why aren’t they part of the language already?

Python (which is what we’re using, 3.5 at that) is not a dinosaur, and most of the math for these data structures was done like fifty years ago. I’d understand if the point was to get C to be more efficient, but most people don’t even code in C any more. It’s not the right tool for very many of the jobs we have to do.

In fact, this class seems closer to a language design course than something practical to software development. I know that’s the point of the high-flown “computer science” department, but… come on. Even interviewers are getting the idea that this kind of question doesn’t matter that much in the real world. It seems more to me like something to be learned on the fly, when needed. Which in practice would probably just mean memorizing Cracking the Coding Interview because you need a job.

Why does this theory class have to be so… theoretical?

I wish the teacher would spend some time telling us when to use these structures, rather than just what they are and how they’re implemented. Otherwise I think this course may do more harm than good.

Why in the world would you ever use a singly-linked list? Most languages–and especially the ones most commonly in use–have array or list structures of their own, which are 1) optimized for you, 2) don’t require extra code, 3) have been tested far more thoroughly than your code ever will be and are thus more stable and predictable, and 4) don’t confuse the hell out of other people who come in and read your program!

Why in the world would you bother implementing a doubly-linked list either, unless you’re coding a programming language of your own?! None of this makes sense! We have a dedicated class for language design! We have a dedicated class for C programming! Why isn’t it in there instead?!

Ahem. *deep breath*

Hash tables are kind of cool, and so are binary heaps (although they’re less practical). My affection for their clever hackitude is rather stifled by the suspicion that, again, they’d probably be built-in structures if they were that useful. Like Python dicts–those are hash tables behind the scenes. Use those. Everyone knows what they are and you don’t have to code them yourself.

If you’re working with large amounts of data, hash tables and binary heaps could be useful. But the professor doesn’t talk about when, just how to use them. If you’re not working with big data, chances are you just need a dict or list, and can spare yourself and others the experience of trying to interpret your weird code later.

But the professor doesn’t talk about that kind of thing. I wonder if he’s getting homework assignments that use this stuff unnecessarily. He hasn’t been taking points off mine because my work still does what the spec asks for, as clearly/briefly/user-friendly-ly as possible. I haven’t been looking for ways one might use weird data structures in the assignments, so I don’t know if they’re designed to invite us to use them or what.

He also covered recursion, which I think is very useful–again, if you know when to use it, which depends both on the problem you’re solving and the language you’re using. I will casually use recursion to make my code cleaner-looking even if it isn’t always the most efficient option (in Python). But that’s for readability. Mostly I use it when I’m getting user input, and I stick the prompt in its own method and loop if I get bad input. That lets me put all the lengthy, messy error-checking off somewhere and the main program will get back a good value.

I think this practice is supposed to be kind of evil according to functional programming, because user input functions aren’t pure functions? Or maybe, because I’ve sectioned them off and make sure they return good values, they’re exactly what you’re supposed to do in functional programming. Don’t know yet. It works, anyway.

I still use “else:” almost exclusively for “Can’t Happen” errors. It saves me headaches when I screw up the recursion. That’s a minor change with using recursion, but it’s an issue only while writing; I’ll put in extra effort when I’m writing the program to make sure it’s easier to read later.

What I will not do is write code whose purpose is to make me look or feel clever for writing it a certain way. That’s insane. Sacrificing readability, even efficiency because you think it’s cooler to use some homebrew data structure code rather than a freaking built-in Python list? That’s absolutely insane.

When interviewers select for people who know how to do this, I wonder if they realize they may be selecting for a subset which includes the actual worst candidates: the smug “Of course I know everything, if you lesser mortals don’t, that’s your problem, Google it, and also I have very strong opinions about i++ vs. ++i and will totally correct you if you’re wrong.” Fools and incompetents may eventually learn. This person doesn’t think they have to.

(Of course, sometimes a person like that is useful to look impressive in customer meetings, but in my admittedly limited experience, they’re just as likely to insult your customers as to impress them. Ever seen a bunch of startups pitch? Some of those folks need an attitude adjustment. Of course, they’re going to fail; it’s hard to have enough empathy with customers to produce a good product if you have such scorn and disdain for their intelligence. Then they’ll claim they went under because they “failed to pivot.”)

Oh, one more thing. Big-O notation is pretty useful. Even if your main use for it is understanding what other people are talking about, and/or making yourself look impressive. It’s so you can find out which of two algorithms is more efficient, which will probably be something you’ll need to do eventually. It looks way more complicated than it is.

I could very well be wrong about this, and I’d be pleased to find out if I were. I’d love to find something that’d make me a better programmer. I’d be terribly pleased to find out I wasn’t learning something nearly useless. But I don’t think that’s the case.

The CS math course I have (enigmatically titled “Discrete Structures”… I have yet to find out why, unless the reason is “because it sounds fancy”) seems a little more practical at least. Logical thinking comes pretty naturally to me, but I know it doesn’t to everyone, so it makes sense to cover it. I know there’s also set theory, graph theory, and combinatorics later on in the course, and I’ve heard those are good things to know for programming. I guess I’ll find out.

Oddly enough, it’s my English course that’s the most practical. The professor decided to forego the traditional giant clunky writing textbook and told us to order this list of like eight different short story collections, which we read and then write and talk about. That takes up about half our class periods. The other half (each day is dedicated to one or the other) is the professor talking about writing: elements of stories in general that are important (like what the stakes are–is war at risk, for instance?), writing style (like sentence structure, word choice, voice, POV), what it means to have universal appeal, what it means to capture an audience and how to get their attention–and especially the skills of critical thinking and backing up your claims.

This format is a little odd for a class called “College Writing and Research”–but it isn’t bad. I think it’s more effective than the normal route. No writer I know learned writing out of a big clunky textbook, or by doing essays on symbolism in Dickens or whether school uniforms are a good idea.

Since CS folks often point and laugh at the liberal arts, saying their study is useless bull and all those English majors should be studying something useful and practical (like computer science of course), well… I have an uncomfortable piece of news.

Not that I’m saying majoring in English is more likely to get you a job (although it’s better off than, say, Gender Studies). But knowing how to write effectively, which my English class will teach you, is far more practical than knowing how to invert a binary tree.

Sorry, prof.

Please read the comments.

Advertisements