Data Structures: a rant

(If you’re curious, the textbook from this class–and the source of the explanatory links I’m using in this post–is available online free here: http://interactivepython.org/runestone/static/pythonds/index.html)

– – –

I just got out of another Data Structures class. I zoned out halfway through, which is somewhat unusual for me. Happens more often in Data Structures, because the class starts at 8AM. (Fortunately, I think I found a less winding route to cut down on my time walking to class–hopefully the shortcut will make me less prone to lateness. I am not a morning person. I tend to dream about waking up and going to class. I’ve even slept through the 10AM labs, which is ridiculous and frustrating–but also kind of understandable, because they’re on the day after the 8AM class, and my brain is trying to get rid of its sleep debt.)

But that’s not really why I started tuning out the professor’s lecture today. It was the result of another impractical implementation on the projector, at which point a thought struck me:

If these data structures are so useful, why aren’t they part of the language already?

Python (which is what we’re using, 3.5 at that) is not a dinosaur, and most of the math for these data structures was done like fifty years ago. I’d understand if the point was to get C to be more efficient, but most people don’t even code in C any more. It’s not the right tool for very many of the jobs we have to do.

In fact, this class seems closer to a language design course than something practical to software development. I know that’s the point of the high-flown “computer science” department, but… come on. Even interviewers are getting the idea that this kind of question doesn’t matter that much in the real world. It seems more to me like something to be learned on the fly, when needed. Which in practice would probably just mean memorizing Cracking the Coding Interview because you need a job.

Why does this theory class have to be so… theoretical?

I wish the teacher would spend some time telling us when to use these structures, rather than just what they are and how they’re implemented. Otherwise I think this course may do more harm than good.

Why in the world would you ever use a singly-linked list? Most languages–and especially the ones most commonly in use–have array or list structures of their own, which are 1) optimized for you, 2) don’t require extra code, 3) have been tested far more thoroughly than your code ever will be and are thus more stable and predictable, and 4) don’t confuse the hell out of other people who come in and read your program!

Why in the world would you bother implementing a doubly-linked list either, unless you’re coding a programming language of your own?! None of this makes sense! We have a dedicated class for language design! We have a dedicated class for C programming! Why isn’t it in there instead?!

Ahem. *deep breath*

Hash tables are kind of cool, and so are binary heaps (although they’re less practical). My affection for their clever hackitude is rather stifled by the suspicion that, again, they’d probably be built-in structures if they were that useful. Like Python dicts–those are hash tables behind the scenes. Use those. Everyone knows what they are and you don’t have to code them yourself.

If you’re working with large amounts of data, hash tables and binary heaps could be useful. But the professor doesn’t talk about when, just how to use them. If you’re not working with big data, chances are you just need a dict or list, and can spare yourself and others the experience of trying to interpret your weird code later.

But the professor doesn’t talk about that kind of thing. I wonder if he’s getting homework assignments that use this stuff unnecessarily. He hasn’t been taking points off mine because my work still does what the spec asks for, as clearly/briefly/user-friendly-ly as possible. I haven’t been looking for ways one might use weird data structures in the assignments, so I don’t know if they’re designed to invite us to use them or what.

He also covered recursion, which I think is very useful–again, if you know when to use it, which depends both on the problem you’re solving and the language you’re using. I will casually use recursion to make my code cleaner-looking even if it isn’t always the most efficient option (in Python). But that’s for readability. Mostly I use it when I’m getting user input, and I stick the prompt in its own method and loop if I get bad input. That lets me put all the lengthy, messy error-checking off somewhere and the main program will get back a good value.

I think this practice is supposed to be kind of evil according to functional programming, because user input functions aren’t pure functions? Or maybe, because I’ve sectioned them off and make sure they return good values, they’re exactly what you’re supposed to do in functional programming. Don’t know yet. It works, anyway.

I still use “else:” almost exclusively for “Can’t Happen” errors. It saves me headaches when I screw up the recursion. That’s a minor change with using recursion, but it’s an issue only while writing; I’ll put in extra effort when I’m writing the program to make sure it’s easier to read later.

What I will not do is write code whose purpose is to make me look or feel clever for writing it a certain way. That’s insane. Sacrificing readability, even efficiency because you think it’s cooler to use some homebrew data structure code rather than a freaking built-in Python list? That’s absolutely insane.

When interviewers select for people who know how to do this, I wonder if they realize they may be selecting for a subset which includes the actual worst candidates: the smug “Of course I know everything, if you lesser mortals don’t, that’s your problem, Google it, and also I have very strong opinions about i++ vs. ++i and will totally correct you if you’re wrong.” Fools and incompetents may eventually learn. This person doesn’t think they have to.

(Of course, sometimes a person like that is useful to look impressive in customer meetings, but in my admittedly limited experience, they’re just as likely to insult your customers as to impress them. Ever seen a bunch of startups pitch? Some of those folks need an attitude adjustment. Of course, they’re going to fail; it’s hard to have enough empathy with customers to produce a good product if you have such scorn and disdain for their intelligence. Then they’ll claim they went under because they “failed to pivot.”)

Oh, one more thing. Big-O notation is pretty useful. Even if your main use for it is understanding what other people are talking about, and/or making yourself look impressive. It’s so you can find out which of two algorithms is more efficient, which will probably be something you’ll need to do eventually. It looks way more complicated than it is.

I could very well be wrong about this, and I’d be pleased to find out if I were. I’d love to find something that’d make me a better programmer. I’d be terribly pleased to find out I wasn’t learning something nearly useless. But I don’t think that’s the case.

The CS math course I have (enigmatically titled “Discrete Structures”… I have yet to find out why, unless the reason is “because it sounds fancy”) seems a little more practical at least. Logical thinking comes pretty naturally to me, but I know it doesn’t to everyone, so it makes sense to cover it. I know there’s also set theory, graph theory, and combinatorics later on in the course, and I’ve heard those are good things to know for programming. I guess I’ll find out.

Oddly enough, it’s my English course that’s the most practical. The professor decided to forego the traditional giant clunky writing textbook and told us to order this list of like eight different short story collections, which we read and then write and talk about. That takes up about half our class periods. The other half (each day is dedicated to one or the other) is the professor talking about writing: elements of stories in general that are important (like what the stakes are–is war at risk, for instance?), writing style (like sentence structure, word choice, voice, POV), what it means to have universal appeal, what it means to capture an audience and how to get their attention–and especially the skills of critical thinking and backing up your claims.

This format is a little odd for a class called “College Writing and Research”–but it isn’t bad. I think it’s more effective than the normal route. No writer I know learned writing out of a big clunky textbook, or by doing essays on symbolism in Dickens or whether school uniforms are a good idea.

Since CS folks often point and laugh at the liberal arts, saying their study is useless bull and all those English majors should be studying something useful and practical (like computer science of course), well… I have an uncomfortable piece of news.

Not that I’m saying majoring in English is more likely to get you a job (although it’s better off than, say, Gender Studies). But knowing how to write effectively, which my English class will teach you, is far more practical than knowing how to invert a binary tree.

Sorry, prof.

Please read the comments.

Advertisements

Pitfalls in learning to program

Tim H. commented:

[…] do you have any advice based on personal experience from when you first started coding? Things to do (or avoid doing) or perhaps something that you know now that you wish you had known when you first started?

Oh, please. I wish I could say that I took all the right turns while learning–if I had, I’d be a much better coder than I am–but that’s not the case.

This is probably more than you bargained for, but here–you’re now the… third? commenter to get a full post in reply to a question.

Building on your own

The biggest thing is that building stuff outside the books and classes is the most important thing to do. I’ve been coding in some form or another since I was about twelve: RPG Maker XP at 12, basic web design at 13-14, VB at 15, Java at 16, and then I went to college–I didn’t choose all my tools, most of this is an expression of what classes/books/programs were available to me at the time. But I usually didn’t go out and build stuff with what I learned in my classes.

I did try to make some games with RPGXP, but I was pretty awful at it because I didn’t know Ruby (the game maker has a Scratch-like interface that generates syntax) and the English documentation was very lacking (I looked). And I put together a few web sites for my parents’ birthdays that… were actually pretty nice-looking, but never got published. I was kind of lukewarm about VB, and I had a hard time coding in Java on my own computer for some reason–something about NetBeans not wanting to work.

–Basically, I tended to use the stuff I self-taught (RPGXP and HTML/CSS) more than the stuff I learned in classes (VB and Java). I did have a class on HTML/CSS, but not before I self-taught it, which was good because the class was horribly outdated. That repeated itself with the Linux class–which was taught very well and was up to date, but didn’t teach me as much as I’d learned by tinkering.

School isn’t as useful as you think

This pattern has held consistently through college: I learn better, and learn more relevant stuff, by screwing around with stuff on my own than by sitting in a class.

What I learned in class: Flash, old Dreamweaver (which was as bad as Flash), VB, Java, pseudocode markup, flowcharts (as if I couldn’t make those before), Windows networking, shell scripting, C#, .NET programming (sort of–the class had serious technical difficulties)

What I’ve self-taught: RPGXP (sort of), HTML/CSS, Linux, Python, working with an API, and a tiny bit of FileZilla and phpMyAdmin for freelance web work. Plus all the stuff I read online about programming and startups and businesses and new technology and other awesomeness of that stripe.

What I’ve used: the second list (minus RPGXP really)

What I feel like I know well: the second list (minus RPGXP again)

I’ve long maintained that I’m going to college to get a degree (formal credentials) and to meet people. I’m much less there to learn, because if that were my only priority, I could accomplish it more cheaply and efficiently by sitting around in an ethnic restaurant of my choice with some books and my laptop every day.

Although, probably the teacher who’s taught me the most is Mr. Noord, because he rarely stays on topic for more than half an hour, and the stuff he ends up talking about tends to be more useful and interesting. Don’t tell the school board.

It’s totally different for networking students though. Then classes are way more worthwhile because the school has the hardware they need to work with, and networking is easier to teach in a class.

Programming is an art form that can’t be taught in a class. Programming is a separate skill from knowing the syntax of programming languages, which can be taught in classes. Knowing the syntax is like memorizing a French textbook. But to learn French, you have to speak it (write programs), listen to other people speak it (read programs), let them correct your speech (take criticism from older programmers), learn the current grammar and slang rules (style and best practices), and keep using your skills.

Classes just don’t usually let you mess around with stuff, and when they do, they put lots of restrictions in place: how long you get to work, how fast you have to work, what you can make, and what other things you have to do or write in order to discuss what you learned so someone can prove you learned it and somehow quantify what you did. It’s a real pain. And then you have to pay for them, instead of getting to spend the money on new books or tools or hardware.

I’m not saying that schools can’t be useful. They can be! They can teach the basics and introduce you to other programmers, and they introduce you to some great teachers and other students. They just won’t teach you the sort of skill that makes a great programmer.

Books are quite a bit better, but still watch out

Books still won’t get you all the way; you have to build on your own. But, at least for me, they’re definitely more productive than classes, in terms of learning a new technology or language.

Books tend to a) be cheaper, b) invite you to mess around freely and without restrictions on time or what you do with the example code or stuff like that, and c) let you choose your tools. When you buy a book, there aren’t required classes. You get to study what you want.

The tools you choose are important

The main reason I’ve never used VB or C# or Flash or the other stuff I took classes on isn’t necessarily because the teacher was bad or the curriculum was awful. On the contrary–I’ve had mostly great teachers.

But the technologies are pretty crap, to be honest.

If you’ve been around very long, you know about my burning hatred of Visual Studio. I didn’t always hate it so much. I used to be lukewarm towards it, back in the days I was using VS 2012 for my Visual Basic classes. We had technical difficulties with it even then, and it liked to crash and all sorts of crap, although nothing like the temper tantrums it’s thrown in my more recent classes.

But now that I know what programming is actually like, I look back on it and realize how boring and tedious it was. It sucked! No wonder I wanted to go into psychology instead. I thought coding was cool, but sometimes I wondered if I just thought the idea of coding was cool and I liked the romantic techie-nerd image it let me imagine, because what I was doing was not that much fun.

(Straight-up HTML and CSS isn’t very fun either, to be honest. Once you get good at it, both the novelty and the challenge kind of wear off. I’m looking forward to learning to code dynamic web pages, though.)

Java was okay, but we didn’t get very far into it, the coding examples were largely busywork, and it was really complex. I think we just got a bad textbook, honestly.

The books you choose are important

The difference between a bad textbook and a good textbook is like the difference between a bad teacher and a good teacher.

Here are the different kinds of books:

Diving Into ASP.NET, MVC, TLA, GPS, and SAT by J. Random Microserf

This will probably have some geometric feature on the cover. It’ll look deceptively shiny and colorful, but it’s as engaging a read as the Terms of Use agreement for the technology it’s purporting to teach. Said technology will of course be the kind that has a Terms of Use agreement, because these books are usually made in order to sell both the book and the IDE or language or whatever the book is about, plus all the attached certifications (which probably all have test prep books of their own).

They’re written by someone who probably does not like the technology they’re writing about, even if they say otherwise in the intro. This will translate into the writing, and you will end up not liking it either.

Generally, these books are best used as eBay fodder or, if the technology described therein has already flunked out (it will soon if it hasn’t), as expensive firewood.

Learn Foo in 24 Hours!!!!

You will not learn foo in 24 hours.

You may get a basic but deceptively broad understanding of foo. But you will not have learned foo.

This book is not quite firewood. It may be helpful, if not so in-depth.

Weird homebrew-looking e-book sold on Amazon

?????

Head First Foo

Made by O’Reilly. This will have a lot of pictures and silly jokes. They’re great for learning from if you need to get a complete understanding of a technology quickly, or if you’re a beginner and you’re kind of intimidated.

I lose patience with them, though, under two circumstances:

  1. I have something I want to build, and I don’t want to sift through 50 pages of pictures of tigers and pizza before I find the bit of information I’m looking for. What I need is not exactly a reference book; I just want the info in a more dense and organized form.
  2. I’m trying to read it as an ebook. They don’t translate well to the format. Don’t buy these as ebooks.

Notably, they’re bad reference books, as you might have gathered. Don’t get them for that either.

Making Stuff with Foo

This could be “Building Dynamic Websites with Django”, “Creating Apps with Kivy” (<3), et cetera.

It’ll contain something like this in the intro.

I’ve been using this technology for foo amount of time, and I think it’s a great tool for doing bar because of how it bazzes the quuxilators. I tried some other ways of doing bar and overall, I like this one the best because it’s well-designed and has great features.

Anyway, I thought a lot about how to write this book, and you might notice that the example code (which is on GitHub at [URL], by the way) is actually not a bunch of different projects–it’s just a couple projects, and we change them over the different chapters so you can see the development cycle better.

I really hope you have fun making stuff with foo using this book. If you find any errors along the way, send me an email at foo@gmail.com.

Not exactly, necessarily, but approximately that tone–you get the idea. The tone of the book will be friendly, but practical and not too goofy. These are great books!

It has a picture of an animal on the front?

That is probably an O’Reilly book. It is worth its weight in gold.

These are very often the “Making Stuff with Foo” variety.

If O’Reilly doesn’t make a book about a technology that is well known enough to have school classes taught on it for, say, a year… that technology may not be worth learning. Unless you can find someone else who wrote a good “Making Stuff with Foo” book about it, and then it might be okay.

Dry Internet documentation

For when you already know what you’re doing. Confusing, frustrating, and boring if you don’t. Buy a book.

Good Internet documentation

Save your bookmarks!

Make stuff. Solve problems. Meet other nerds. Eat cake. Browse GitHub. Play a board game. Tinker with stuff. Install Linux on something. Eat cake again. Hang with your nerd friends.

Have fun.

Happy hacking!