My current theory is that programming is quite literally writing.

A bit off topic, but you triggered something I’ve been thinking about for a couple of years. That “spark” is fluency.

I switched jobs from being a computer programmer to being an ESL teacher in Japan. Japan is somewhat famous for churning out students who know a lot about English, but can’t order a drink at McDonald’s. We used to have a name for those kinds of people with regard to programming languages: language lawyers. They can answer any question you put to them about a programming language, but couldn’t program to save their life. These people often make it past job interviews easily, but then turn out to be huge disappointments when they actually get down to work. I’ve read a lot about this problem, but the more I look at it, the more I realise that these disabled programmers are just like my students. They have a vocabulary of 5000 words and know every grammar rule in the book, but just can’t speak.

My current theory is that programming is quite literally writing. The vast majority of programming is not conceptually difficult (contrary to what a lot of people would have you believe). We only make it difficult because we suck at writing. The vast majority of programmers aren’t fluent, and don’t even have a desire to be fluent. They don’t read other people’s code. They don’t recognise or use idioms. They don’t think in the programming language. Most code sucks because we have the fluency equivalent of 3-year-olds trying to write a novel. And so our programs are needlessly complex.

Those programmers with a “spark” are programmers who have an innate talent for the language. Or they are people who have read and read and read code. Or both. We teach programming wrong. We teach it the way Japanese teachers have been teaching English. We teach about programming and expect that students will spontaneously learn to write from this collection of facts.

In language acquisition there is a hypothesis called the “Input Hypothesis”. It states that all language acquisition comes from “comprehensible input”. That is, if you hear or read language that you can understand based on what you already know and from context, you will acquire it. Explanation does not help you acquire language. I believe the same is true of programming. We should be immersing students in good code. We should be burying them in idiom after idiom after idiom, allowing them to acquire the ability to program without explanation.

So let’s read more good code.

via slashdot.

Posted

iCloud: a different kind

IIRC, Android sort of does what iCloud does, but iCloud is very different and IMHO a much better offering from a developer perspective. According to Apple’s website:

iCloud Storage APIs enable your apps to store documents and key value data in iCloud. iCloud will wirelessly push documents to a user’s device automatically and update the documents when changed on any device — automatically.

iCloud allows an app developer to treat the cloud as a natural extension of the iOS device. In-cloud storage is almost as easy to use as saving data locally, and you get push synchronization for free.

This is similar to Dropbox, but baked right into the system APIs. I don’t think there is anything like this in Android. You can of course hack something together using the Google Storage API or Amazon S3, but they are very different beasts from iCloud: while iCloud associates the storage with a user, Google and Amazon associate it with an app. That difference has a huge implication. iCloud makes it trivially easy to synchronize data for one app across devices of the same user. To achieve the same thing with Google’s and Amazon’s offerings, you would have to provide your own code to identify users, isolate the data of different users in the cloud, and bring in your own synchronization/push magic. On the other hand, if you want to mine data across all users (answering questions like which pages are bookmarked most), then you might find iCloud’s extra layer of abstraction getting in your way.
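To make the app-scoped side of that trade-off concrete, here is a minimal sketch in Python. It assumes a plain dict standing in for an app-wide bucket; the key layout and the function names (`user_key`, `save`, `load_all_for_user`, `most_bookmarked`) are made up for illustration and are not part of any Google or Amazon API. The per-user isolation has to be written by hand, but a query across all users is just a loop.

```python
from collections import Counter


def user_key(user_id, doc_name):
    # Hypothetical key layout: every document lives under its owner's prefix.
    return f"users/{user_id}/{doc_name}"


def save(bucket, user_id, doc_name, data):
    bucket[user_key(user_id, doc_name)] = data


def load_all_for_user(bucket, user_id):
    # Isolation is our job: only return keys under this user's prefix.
    prefix = f"users/{user_id}/"
    return {k[len(prefix):]: v for k, v in bucket.items() if k.startswith(prefix)}


def most_bookmarked(bucket, top=1):
    # Cross-user mining is easy precisely because the bucket is app-scoped.
    counts = Counter()
    for key, pages in bucket.items():
        if key.endswith("/bookmarks"):
            counts.update(set(pages))
    return counts.most_common(top)
```

Synchronization and push notifications, which iCloud provides for free, are not covered here at all and would be further code to write.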

Posted

OAuth

Ursula is a photographer, and she loves the photo house called Sophia’s. The place is nice and clean, the service is superb, and the price is reasonable. Since Ursula’s freelance business is doing pretty well, she uses Sophia’s quite often and enjoys a 20% discount as part of their VIP program.

It’s Christmas time, and Ursula receives lots of orders to take family portraits in customers’ houses. Running around like crazy to fulfill all the orders, she can no longer visit Sophia’s to pick up the finished prints.

There is this super nice and considerate customer named Claire who offers to pick up the prints from Sophia’s on her own. Ursula loves this idea. However, there is one problem. Sophia’s treats privacy very seriously. They would not allow anyone to pick up Ursula’s prints without Ursula’s driver’s license. Ursula obviously cannot give Claire her driver’s license because, well, there is simply not enough trust between them, and asking Claire to return the license later is also too much.

Ursula checks with Sophia’s about this situation, and is happy to find out there is another way to delegate access to her prints. She just needs to follow this procedure.

  1. Make sure Claire is registered at Sophia’s. This is to make sure that if Claire does something bad, Sophia’s can track her down. This is a one-time process.

  2. If Claire is indeed on file at Sophia’s, when she is ready to come in and pick up the pictures, she needs to have something called the Access Token. The Access Token basically says which resource the token holder is authorized to access, and during which time period. In this case, it should say the holder is authorized to pick up Ursula’s prints on her behalf.

  3. To get the Access Token, Claire needs a Request Token endorsed by Ursula for exchange. She first calls Sophia’s to get a general Request Token. Based on instructions on the Request Token, Claire asks Ursula to call Sophia’s directly to validate this Request Token. When the validation is successful, Sophia’s asks Ursula to call Claire and tell her the Request Token is now endorsed.

  4. With the endorsed Request Token, Claire then calls Sophia’s to exchange it for a long-term Access Token.

  5. With the Access Token, Claire can visit Sophia’s at any time before the token expires to get the prints as authorized by Ursula.
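The five steps above can be sketched as a small in-memory simulation in Python. This is only an illustration of the token hand-offs in the story: the class and method names (`PhotoHouse`, `endorse`, `exchange`, `pick_up`) are invented here, and the sketch leaves out things a real OAuth deployment needs, such as request signing, consumer secrets, and callbacks.

```python
import secrets
import time


class PhotoHouse:
    """In-memory stand-in for Sophia's (the service provider)."""

    def __init__(self):
        self.registered = set()      # step 1: consumers on file
        self.request_tokens = {}     # token -> {"consumer": ..., "owner": ...}
        self.access_tokens = {}      # token -> {"owner": ..., "expires": ...}
        self.prints = {"ursula": ["family-portrait-01.jpg"]}

    def register(self, consumer):
        self.registered.add(consumer)

    def issue_request_token(self, consumer):
        # Step 3, first half: only registered consumers get a Request Token.
        if consumer not in self.registered:
            raise PermissionError("consumer is not on file")
        token = secrets.token_hex(8)
        self.request_tokens[token] = {"consumer": consumer, "owner": None}
        return token

    def endorse(self, owner, request_token):
        # Step 3, second half: the resource owner validates the token directly.
        self.request_tokens[request_token]["owner"] = owner

    def exchange(self, consumer, request_token, lifetime_s=3600):
        # Step 4: an endorsed Request Token is traded for an Access Token.
        entry = self.request_tokens.get(request_token)
        if entry is None or entry["consumer"] != consumer or entry["owner"] is None:
            raise PermissionError("request token is not endorsed for this consumer")
        del self.request_tokens[request_token]
        access = secrets.token_hex(8)
        self.access_tokens[access] = {
            "owner": entry["owner"],
            "expires": time.time() + lifetime_s,
        }
        return access

    def pick_up(self, access_token):
        # Step 5: the Access Token names the resource and the time window.
        grant = self.access_tokens.get(access_token)
        if grant is None or grant["expires"] < time.time():
            raise PermissionError("access token is missing or expired")
        return self.prints[grant["owner"]]
```

In this sketch Claire would call `register`, then `issue_request_token`; Ursula calls `endorse`; Claire calls `exchange` and finally `pick_up`. Note that Ursula's driver's license (her credentials) never passes through Claire's hands.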

Posted

On Facebook's potential

Many assume that Facebook has to rely on Google-style in-the-margin ads as its major revenue source. I believe Facebook has the potential to bring a new form of online advertising: for example, an interactive, game-like campaign that spreads virally across the friend network, with participants given incentives like Facebook credits.

It’s rather limiting to think of Facebook as an advertiser like Google. The real potential of Facebook is way beyond this. Facebook is not just connecting people to their friends. It is a middleman: a platform that connects customers to vendors, fans to artists, and gamers to game developers, and there could be much more than these. Think of iOS as a platform that connects mobile users and app developers, and how well that has worked out (and would have worked out even if it were a standalone company). Maybe someone will develop an Amway-style direct selling platform on Facebook. Maybe someone could develop a CRM and compete with salesforce.com for SMB. Or maybe a Priceline clone with social features would emerge. And the good thing is that Facebook does not need to do all of this itself. It just needs to inspire a developer community around it and evolve its platform to support new possibilities.

It’s a gold rush again, and remember that last time it wasn’t the gold miners who got rich; it was the people who sold the miners and other gold rush followers the tools and supplies they needed.

Posted

Ars Thinks Google Takes a Step Backwards For Openness - Slashdot

 

That's all fine and well for Google and their endless buckets of cash, but what about other companies, or, importantly, startups who want to get into the game?

H.264 is a standard; not a de-facto, or "industry" standard, but one adopted by an international standards body with wide representation. It publishes specs. If you build a part to do something with H.264 video, as long as it conforms to spec, it will work with others' products. You know, like the way any unlocked GSM phone works on any GSM network that operates on the same frequency band. It's ideal for startups, because you only need expertise in your own narrow product field, not in the entire much broader space. To build, say, an innovative silicon decoder, you don't need to know how to build an encoder, because the elementary stream conforms to the standard. You don't need to know whether it came off a disc or Ethernet. And while you occasionally run into interop issues, this is positively nothing compared to the alternative of having in-house expertise for *everything*. Not to mention the cost of dealing with some hacker who thinks they're doing something smart in the encoder, blowing up the taped-out decoder you've sent off to fab!

Compared to other costs, licensing fees are fairly trivial. $100k doesn't even buy a competent engineer for a year.

H.264 is a standard and that means a lot! Google is sounding childish in its own wonderland of a world of openness without understanding what an international standard entails.

Posted

Google To Drop Support For H.264 In Chrome

Does Chrome really have the market share required for this move to have any effect on the decisions of web designers?

Yes. Chrome is rapidly eating market share: in just about 2 years since launch, it's at 13.5%. This is twice the share of Opera and Safari combined. But the decision to drop H.264 doesn't put Chrome "versus the world", as they already had Firefox and Opera in their camp (which also lack H.264). Opera + Firefox + Chrome make up over 50% of the browsers used today, in market share.

This is substantially different from the previous situation, where Google, Microsoft and Apple all had an H.264 browser, and Firefox looked like the odd one out, while Opera was quietly waiting for the market to decide (they'd have no choice but to support H.264, if Firefox did it).

However, the battle is still not over for H.264. The common wisdom is that Google is pushing their WebM standard and that's why they drop H.264. If they really think it's that simple, they have not done their math right.

The growth is with mobile devices. The leaders among them are Apple with iOS and Google with Android, both of which come with hardware support for H.264, and no WebM hardware support (future support in... theory, but I can say, count Apple out). So what are web content owners left to do? Maybe encode all content twice: WebM and then H.264. Imagine the hassle for, ironically, Google's very own YouTube, having YET another version of every single video they have in their library: FLV, H.264 and now WebM.

No, actually web authors will opt for the simplest choice, the one that's the least amount of work: the same H.264 video everywhere, making use of hardware support for H.264 in mobiles, exposed via HTML5, and ... Flash on the desktop, which also supports exactly the same H.264 videos.

So, in an attempt to push WebM, Google may end up accidentally (or not..?) cementing Flash's position on the desktop as the video player for the foreseeable future.

I used to think Flash would fade away considerably once IE9 becomes mainstream (it comes with a GPU-accelerated renderer and H.264 support), but now things are suddenly interesting again for Adobe.

A very insightful look at the issue. Thanks to the big mobile device market, this move will not harm H.264, but it will help Flash.

Posted

Cassandra vs MongoDB vs CouchDB vs Redis vs Riak comparison

While SQL databases are insanely useful tools, their tyranny of ~15 years is coming to an end. And it was about time: I can't even count the things that were forced into relational databases, but never really fitted them.

But the differences between "NoSQL" databases are much bigger than they ever were between one SQL database and another. This means that software architects have a bigger responsibility to choose the appropriate one for a project right at the beginning.

In this light, here is a comparison of Cassandra, MongoDB, CouchDB, Redis and Riak:

CouchDB

  • Written in: Erlang
  • Main point: DB consistency, ease of use
  • License: Apache
  • Protocol: HTTP/REST
  • Bi-directional (!) replication,
  • continuous or ad-hoc,
  • with conflict detection,
  • thus, master-master replication. (!)
  • MVCC - write operations do not block reads
  • Previous versions of documents are available
  • Crash-only (reliable) design
  • Needs compacting from time to time
  • Views: embedded map/reduce
  • Formatting views: lists & shows
  • Server-side document validation possible
  • Authentication possible
  • Real-time updates via _changes (!)
  • Attachment handling
  • thus, CouchApps (standalone js apps)
  • jQuery library included

Best used: For accumulating, occasionally changing data, on which pre-defined queries are to be run. Places where versioning is important.

For example: CRM, CMS systems. Master-master replication is an especially interesting feature, allowing easy multi-site deployments.

Redis

  • Written in: C
  • Main point: Blazing fast
  • License: BSD
  • Protocol: Telnet-like
  • Disk-backed in-memory database,
  • but since 2.0, it can swap to disk.
  • Master-slave replication
  • Simple keys and values,
  • but complex operations like ZREVRANGEBYSCORE
  • INCR & co (good for rate limiting or statistics)
  • Has sets (also union/diff/inter)
  • Has lists (also a queue; blocking pop)
  • Has hashes (objects of multiple fields)
  • Of all these databases, only Redis does transactions (!)
  • Values can be set to expire (as in a cache)
  • Sorted sets (high score table, good for range queries)
  • Pub/Sub and WATCH on data changes (!)

Best used: For rapidly changing data with a foreseeable database size (should fit mostly in memory).

For example: Stock prices. Analytics. Real-time data collection. Real-time communication.
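As an illustration of the "INCR & co (good for rate limiting or statistics)" and "values can be set to expire" points, here is a rough Python sketch of a fixed-window rate limiter. It does not talk to a Redis server: `TinyCounterStore` is a made-up in-memory stand-in that only imitates the behaviour of the INCR and EXPIRE commands, and `allow`, the key format, and the default limits are likewise assumptions for the example.

```python
import time


class TinyCounterStore:
    """In-memory stand-in imitating two Redis commands: INCR and EXPIRE."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._values = {}
        self._deadlines = {}

    def _purge(self, key):
        # Drop the key once its time-to-live has passed.
        deadline = self._deadlines.get(key)
        if deadline is not None and self._clock() >= deadline:
            self._values.pop(key, None)
            self._deadlines.pop(key, None)

    def incr(self, key):
        # Like INCR: a missing key counts as 0 before incrementing.
        self._purge(key)
        self._values[key] = self._values.get(key, 0) + 1
        return self._values[key]

    def expire(self, key, seconds):
        # Like EXPIRE: set a time-to-live on an existing key.
        if key in self._values:
            self._deadlines[key] = self._clock() + seconds


def allow(store, client_id, limit=3, window_s=60):
    """Allow at most `limit` calls per client in each window."""
    key = f"rate:{client_id}"
    count = store.incr(key)
    if count == 1:
        # First hit in this window: start the countdown.
        store.expire(key, window_s)
    return count <= limit
```

Against a real server the same two-command pattern would normally be wrapped in a transaction or script so that the counter cannot be left without an expiry; that detail is omitted here.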

MongoDB

  • Written in: C++
  • Main point: Retains some friendly properties of SQL. (Query, index)
  • License: AGPL (Drivers: Apache)
  • Protocol: Custom, binary (BSON)
  • Master/slave replication
  • Queries are javascript expressions
  • Run arbitrary javascript functions server-side
  • Better update-in-place than CouchDB
  • Sharding built-in
  • Uses memory mapped files for data storage
  • Performance over features
  • After crash, it needs to repair tables

Best used: If you need dynamic queries. If you prefer to define indexes, not map/reduce functions. If you need good performance on a big DB. If you wanted CouchDB, but your data changes too much, filling up disks.

For example: For all things that you would do with MySQL or PostgreSQL, but having predefined columns really holds you back.

Cassandra

  • Written in: Java
  • Main point: Best of BigTable and Dynamo
  • License: Apache
  • Protocol: Custom, binary (Thrift)
  • Tunable trade-offs for distribution and replication (N, R, W)
  • Querying by column, range of keys
  • BigTable-like features: columns, column families
  • Writes are much faster than reads (!)
  • Map/reduce possible with Apache Hadoop
  • I admit being a bit biased against it, because of the bloat and complexity it has partly because of Java (configuration, seeing exceptions, etc)

Best used: If you're in love with BigTable. :) When you write more than you read (logging). If every component of the system must be in Java. ("No one gets fired for choosing Apache's stuff.")

For example: Banking, financial industry

Riak

  • Written in: Erlang & C, some Javascript
  • Main point: Fault tolerance
  • License: Apache
  • Protocol: HTTP/REST
  • Tunable trade-offs for distribution and replication (N, R, W)
  • Pre- and post-commit hooks,
  • for validation and security.
  • Comes in "open source" and "enterprise" editions

Best used: If you want something Cassandra-like (Dynamo-like), but no way you're gonna deal with the bloat and complexity. If you need very good single-site scalability, availability and fault-tolerance, but you're ready to pay for multi-site replication.

For example: Point-of-sales data collection. Factory control systems. Places where even seconds of downtime hurt.
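The "tunable trade-offs (N, R, W)" listed for both Cassandra and Riak come from Dynamo-style replication: N is the number of replicas holding each key, R the number that must answer a read, and W the number that must acknowledge a write. A commonly cited rule of thumb is that choosing R + W > N makes every read quorum overlap every write quorum. The small Python sketch below only does that arithmetic; the function names are invented for illustration and are not part of either database's API.

```python
def quorums_overlap(n, r, w):
    """True when any R replicas must share at least one node with any W replicas."""
    return r + w > n


def tolerated_failures(n, r, w):
    """How many replicas can be down while reads / writes still succeed."""
    return {"reads": n - r, "writes": n - w}


# A few typical settings for N = 3 replicas.
for r, w in [(1, 1), (2, 2), (1, 3), (3, 1)]:
    print(f"N=3 R={r} W={w}",
          "overlap" if quorums_overlap(3, r, w) else "no overlap",
          tolerated_failures(3, r, w))
```

Lower R and W give faster, more available operations at the cost of possibly stale reads; higher values trade availability for consistency.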

Of course, all systems have many more features than what's listed here. I only wanted to list the key points that I base my decisions on. Also, development of all of them is very fast, so things are bound to change. I'll do my best to keep this list updated.

-- Kristof

Shameless plug: I'm a freelance software architect and consultant; have a look at my services!

Posted

Explain REST

Martin Fowler gives a deep explanation of what being RESTful means. The use of REST ranges across several levels: from something as raw as SOAP, which only uses HTTP as a transport layer between the client and a single server endpoint; to mapping each resource to a dedicated URI; to mapping operations and errors to HTTP methods and status codes; and finally to including links in responses to help clients discover the service.
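The later levels can be illustrated with a toy, in-memory handler in Python. This is only a sketch: the `/notes` resource, the `handle` function, and the shape of the `links` field are invented for the example, and no real HTTP server is involved. Each note gets its own URI, operations map to HTTP methods and status codes, and every response carries links that tell the client where it can go next.

```python
import itertools

_notes = {}
_ids = itertools.count(1)


def _representation(note_id):
    # Links in the payload let a client discover related URIs.
    return {
        "text": _notes[note_id],
        "links": {"self": f"/notes/{note_id}", "collection": "/notes"},
    }


def handle(method, path, body=None):
    """Return (status_code, payload) for a tiny /notes resource."""
    parts = [p for p in path.split("/") if p]

    if parts == ["notes"]:
        if method == "POST":
            note_id = next(_ids)
            _notes[note_id] = body
            return 201, _representation(note_id)   # Created
        return 405, {"error": "method not allowed"}

    if len(parts) == 2 and parts[0] == "notes" and parts[1].isdecimal():
        note_id = int(parts[1])
        if note_id not in _notes:
            return 404, {"error": "no such note"}
        if method == "GET":
            return 200, _representation(note_id)   # OK
        if method == "DELETE":
            del _notes[note_id]
            return 204, None                       # No Content
        return 405, {"error": "method not allowed"}

    return 404, {"error": "unknown resource"}
```

A client that only knows `/notes` can POST to it, read the `self` link from the response, and use that URI for later GET and DELETE calls without constructing URIs itself.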

Posted