memonic

Gamasutra: Mike Rose's Blog - Using SimCity to diagnose my home town's traffic problem

Using SimCity to diagnose my home town's traffic problem

Mike Rose is Gamasutra's UK video game editor (and traffic correspondent, apparently).

After the first SimCity beta weekend had ended, I read an article by Norman Chan over at Tested.com in which he attempted to discover the best suburban city layout according to the game. It's hugely interesting stuff, and quite frankly I hadn't considered the idea of putting this new SimCity to real-world use.

So when the second beta weekend rolled around, I decided to have a crack at testing some theories that I have about my own hometown of Northenden, Manchester (population of roughly 15,000, as far as I can tell).

Northenden is a pretty small place that you can drive straight through in around five minutes. The town mainly consists of residential housing, with a strip of shops that is known as the center of the town. And yet, considering the low number of people who live here, coupled with the low number of reasons to want to be driving around the town, you still wouldn't want to visit during rush hour. We're talking standstill traffic that you can sometimes expect to sit around in for up to an hour.

Look at a map of the area, and it's not hugely difficult to see why. The town features connections onto multiple motorways, plus the extremely busy Princess Parkway, which goes all the way into the center of Manchester city. To its North, you've got a straight road surrounded by golf courses, that leads to the bustling Didsbury areas.

Essentially, during both the morning commute and the drive back home from work in the evening, Northenden becomes a traffic bottleneck, with hundreds of cars either trying to get onto the motorway, or out into Didsbury. I have to pick up and drop my child off at nursery in Didsbury a couple of times a week, and quite honestly, the drive is like hell on Earth at the worst of times.

At least, this is my reasoning for the ugly traffic. In fact, I've contemplated numerous times exactly what causes the pileups in Northenden, sometimes wondering whether it's simply that the town is laid out in an awful fashion. Now, with the incredibly scientific power of SimCity, I will finally get to the bottom of what is the real cause.

Question 1: Is it the layout of the town that causes the traffic?

I began by sticking Google Maps on my second screen, and then building Northenden in SimCity as best as I could. I wasn't meticulous about the design - I mean, I'm not submitting this to the council or anything - but I made sure that the layout was roughly the same as the real-life town, especially when it came to the main through roads. At the bottom of the following screenshot you can see the town connecting up to Princess Parkway/the motorway, while at the top left, the road connects to Didsbury (or a railway track, as is the case here - more on that later).

My version of Northenden

Switching to the zoning view shows exactly what types of housing you can see here. The blue indicates the line of shops through Northenden center, while the yellow to the right is Sharston Industrial Estate (I added this in since if I didn't have industrial buildings, not many people would actually choose to move into my town!). What's lovely is that my town's population appears to have peaked at around 18,000, which is only just slightly more than the real Northenden. I wish I could say that I did that on purpose.

Zoning in Northenden

There's a 4-way junction in the heart of Northenden that everyone just calls "Tesco", mainly because it has a big old Tesco store on the corner. This is one of the main areas where traffic usually builds up and causes the bottlenecking, so I decided to keep a close eye on how my SimCity traffic piled up there.

As it turns out, it wasn't too bad at all. During the morning rush hour, there was a slight build up but nothing too notable. I surely wouldn't have minded sitting in that level of traffic, and it's a country mile better than the real deal. In the below screenshot you can see the peak of the rush hour, with the robot store in place of where Tesco sits in real life.

The Tesco 4-way junction

This isn't hugely surprising. In its current form, the through-roads aren't really acting as through-roads, since Northenden residents don't have a huge number of places to go along those roads. No, what I really want to test is whether funnelling traffic through the center of Northenden causes jams.

Question 2: Is the issue that Northenden is used as a through-town?

The issue is that SimCity's maps are rather small - at least, in the beta - and as such, I had to improvise a bit. Where Northenden hits SimCity's outlying railway track is where a single long road reaches into West and East Didsbury, both very busy areas. In place of this, I built a long road at a 90-degree angle, parallel to the railway, to replicate the Didsbury area. I created a built-up area filled with shops, and dumped a large casino at the far end (there's a large casino complex in East Didsbury called Parrswood). I then stretched a strip of highway from the casino all the way round to the other end of Northenden - an extremely simple but near-faithful version of the real-world area.

East Didsbury and the Parrswood Casino

With this set-up in place, the hope was that Northenden residents would venture out to Didsbury en masse, and mix in with the tourist traffic coming from the motorway on its way to visit the casino. Within 10 minutes, the effects were clearly noticeable - as per the below screenshot.

How the traffic looks after adding East Didsbury into the mix - note the strips of red running through Northenden center and into Didsbury

Most areas of Northenden were fine, with green and yellow strips on the road signifying no-to-light traffic. However, much of the main road going through town is red or orange, meaning pile-ups and standstill traffic. In particular, the road into East Didsbury was red all the way down.

This definitely echoes the sort of traffic you'd expect to find during the rush hour - that latter road in particular can take a good half an hour to get down when things get particularly bad. In fact, this setup even highlights a little trick that I know of for dodging some of the traffic: when coming from East Didsbury (up at the top right) into Northenden (alongside that bright red strip of road), if you take the second left along Northenden high street, you can skim through some of the back roads and pop out in a relatively lighter area, bypassing the through-road slog. Look at the SimCity map, and it seems to work here too!

Jump ahead to the time when the real rush hour traffic starts to kick in, and that once easy-going junction at Tesco is suddenly rammed with cars.

The Tesco 4-way junction after East Didsbury has been added

So what does this all prove? Is it indeed true that through-traffic coming from Didsbury and the motorway is causing the Northenden bottlenecks? And what can the council do to fix the issue?

Well... the answer is, of course, that it means absolutely nothing. This was but a mere video game experiment, and nothing here even closely resembles scientific evidence to support my theories, nor can it be used to diagnose the issues. Everything I did was hugely vague and nothing at all like real life.

That being said, this little project definitely gave me a sense for what could be accomplished with SimCity if put in the right hands. I noticed plenty of other little interesting takeaways from my hour of play, including the fact that once I'd built East Didsbury, the strip of shops in Northenden stopped making as much money as it once did, and some shops were even beginning to close down as my time ran out. Walk along Northenden high street, and you'll know that feeling.

For me personally, I can't wait to see what happens when the game is released, and people with real scientific experiments give it a run for its money.

And this is what I'll have to work my way through to take my son to nursery in Didsbury tomorrow *sigh*

SXSW 2012: JavaScript Performance MythBusters (via JSPerf)


Chris Joel
CloudFlare, Developer

John David Dalton
Uxebu, JavaScript Developer

Kyle Simpson
Getify Solutions, Owner

Lindsey Simon
Twist, Developer

Presentation Description

JavaScript is everywhere from mobile phones and tablets to e-readers and TVs. With such a wide range of supported environments developers are often looking for an easy way to compare the performance between snippets, browsers, and devices. jsPerf.com, a site for community driven JavaScript benchmarks, was created to help devs do just that.

Join Mathias Bynens and John-David Dalton from jsPerf.com, Chris Joel from Cloudflare.com and Lindsey Simon from Google/Browserscope in this panel discussion on some of the best dev-created benchmarks and most interesting practices debunked by real-world tests.

Presentation Notes

Browserscope and jsPerf

Open-source, community-driven project for profiling browsers. Really good at helping inform developers by providing number crunching and actual data. The whole idea is that anyone can reproduce results with any type of hardware (crowdsourcing).

Explicit Goals:

  • Foster innovation by tracking functionality
  • Push browser innovation, uncover regressions
  • Historical resource for web developers

Myths

  1. Your for loops suck: rewrite all your code and use: while --i BUSTED
  2. Double your localStorage read performance in Chrome by getting by index. TRUE
  3. The arguments object should be avoided. BUSTED (but isn’t as good in Opera)
  4. Frameworks (like jQuery) are always better at managing performance than you are, so just use what they provide. BUSTED
  5. Converting a NodeList to an array is expensive, so you shouldn’t do it. For instance, document.getElementsByTagName() returns a NodeList, not an array; the test compared iterating over the NodeList directly with iterating over an array after taking the performance hit of converting it. BUSTED (also see: static node list, whose performance is closer to that of an array)
  6. Script concatenation and/or <script defer> is all you need to load JS performantly (aka “Issue 28”). POSSIBLY. The average website has over 300K of JavaScript. The best thing to do with your JavaScript is to concatenate all your files, but then split them into chunks of about 100K. This greatly increases download speed when the browser fetches the chunks in parallel. Also, chunking your code so that never-changing JavaScript is separated from frequently changing JavaScript will help with browser caching. Consider lazy loading (pulling in the important file first and then the others).
  7. Eval is evil, it’s too slow and quirky to be considered useful. BUSTED The performance is pretty much equal with all benchmarks.
  8. Regular expressions are slow and should be replaced with simple string method calls using indexOf(). BUSTED Engines are getting faster now with RegEx.
  9. OO API abstraction means you never have to worry about the details (including the performance). BUSTED Your API design matters more than it just being OO.
  10. Strict comparison (===) takes more processing power than a regular, type-coercing comparison (==). BUSTED There is a difference, but it’s so tiny you shouldn’t be concerned.
  11. Caching “nested property lookup” or “prototype chain lookup” helps (or hurts) performance. BUSTED In most cases the browser engine already caches the lookup, so this won’t matter at all.
  12. Operations which require an implicit primitive-to-object cast/conversion will be slower. BUSTED For instance, calling a method such as toString() on a primitive number doesn’t affect performance.
  13. Scope chain lookups are expensive BUSTED
  14. Use switch statements instead of if/else if for better performance. POSSIBLY. In most cases this is true, except in Safari and Mobile Safari. The panel recommended to just use what you need.
  15. Use native methods for better performance. BUSTED

Should You Eat While You Negotiate? - Lakshmi Balachandra - Harvard Business Review


Should You Eat While You Negotiate?

Across cultures, dining together is a common part of the process of reaching negotiated agreements. In Russia and Japan, important business dealings are conducted almost exclusively while dining and drinking and in the U.S., many negotiations begin with "Let's do lunch." But are business deals actually improved when people discuss important matters over a meal?

To explore this question, I conducted two experiments. The first compared negotiations that took place over a meal in restaurants to negotiations in conference rooms, without any food to eat. In the second, negotiations were conducted with or without a meal in a business conference room. In the experiments, 132 MBA students negotiated a complex joint venture agreement between two companies. In the simulation, a provisional deal is in place, but a variety of terms must still be considered and agreed upon to maximize profits for their companies. The negotiators must determine how to handle each term of the deal. As is typical in many negotiations, in order to maximize their profits, the negotiators must share information and work together with the other side to learn where the most value can be created.

The greatest possible profits were created by the parties who were able to discern the other side's preferences and then work collectively to discover the profit maximizing outcomes for the joint venture, rather than merely considering their own company's profits. In the simulation, this can only be accomplished when the negotiators make trade-offs and then compensate each other from the net gains to the joint venture. The maximum value that can be created jointly for both companies is $75 million. Deals can be struck at lower combined values, down to as low as $38 million. To explore how eating together affected negotiation outcomes, I considered the total value created by both companies.

The students who ate together while negotiating, either at a restaurant or over food brought into a business conference room, created significantly greater profits than those who negotiated without dining. (Individuals who negotiated in restaurants created 12% greater profits and those who negotiated over food in a conference room created 11% greater profits.) This suggests that eating while deciding important matters offers profitable, measurable benefits through mutually productive discussions.


I designed a third experiment to test if it was in fact the act of eating together and not merely sharing a separate task that led to the better negotiated outcomes. I had 45 MBA students negotiate the same simulation, but instead of negotiating while eating, half of the groups negotiated while completing a jigsaw puzzle that had nothing to do with the negotiation. In this experiment, I found that the negotiators who shared a common task did not create better negotiation outcomes than those who only negotiated the deal.

I expected that both sharing a meal and collaborating on an activity would increase trust between the participants — and perhaps that the cultural history attached to eating together would increase trust more than sharing other activities — but when I surveyed participants in both studies, the trust levels they reported did not increase.

Why else might eating together improve the outcome of negotiations? There may be biological factors at work. When the negotiators in my first two studies ate, they immediately increased their glucose levels. Research has shown that the consumption of glucose enhances complex brain activities, bolstering self-control and regulating prejudice and aggressive behaviors. Other research has shown that unconsciously mimicking the behaviors of others leads to increased pro-social behaviors; when individuals eat together they enact the same movements. This unconscious mimicking of each other may induce positive feelings towards both the other party and the matter under discussion.

In future experiments, I will continue to explore the reasons why eating while deciding important matters increases the productivity of discussions. In the meantime, you would be wise to suggest "doing lunch" whenever you meet to negotiate.


How Costco Became the Anti-Wal-Mart - New York Times


Despite Costco's impressive record, Mr. Sinegal's salary is just $350,000, although he also received a $200,000 bonus last year. That puts him at less than 10 percent of many other chief executives, though Costco ranks 29th in revenue among all American companies.

"I've been very well rewarded," said Mr. Sinegal, who is worth more than $150 million thanks to his Costco stock holdings. "I just think that if you're going to try to run an organization that's very cost-conscious, then you can't have those disparities. Having an individual who is making 100 or 200 or 300 times more than the average person working on the floor is wrong."

TinyPNG - Shrink your PNG files


How does it work? Excellent question! When you upload a PNG (Portable Network Graphics) file, similar colours in your image are combined. This technique is called "quantisation". Because the number of colours is reduced, 24-bit PNG files can be converted to much smaller 8-bit indexed colour images. All unnecessary metadata is stripped too.
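The idea of combining similar colours into an indexed palette can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it uses fixed 3-3-2 bit bucketing (a hypothetical choice that caps the palette at 256 entries), whereas real quantisers pick the palette adaptively, and nothing here is TinyPNG's actual algorithm. The function name `quantise_332` is made up for this example.

```python
def quantise_332(pixels):
    """Map 24-bit RGB pixels onto a palette of at most 256 colours.

    Colours that share the top 3 bits of red, 3 bits of green and
    2 bits of blue land in the same bucket and so share a palette entry.
    """
    palette = []    # representative RGB colour for each palette index
    index_of = {}   # bucket key -> palette index
    indexed = []    # one palette index per input pixel
    for r, g, b in pixels:
        key = (r >> 5, g >> 5, b >> 6)
        if key not in index_of:
            # The first colour seen in a bucket represents the whole bucket.
            index_of[key] = len(palette)
            palette.append((r, g, b))
        indexed.append(index_of[key])
    return palette, indexed


palette, indexed = quantise_332([(255, 0, 0), (250, 5, 3), (0, 0, 255)])
print(palette, indexed)
```

Each pixel is then stored as a single palette index instead of three colour bytes, which is where the size reduction comes from.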

this, is boomerang


boomerang always comes back, except when it hits something.

what?

boomerang is a piece of javascript that you add to your web pages, where it measures the performance of your website from your end user's point of view. It has the ability to send this data back to your server for further analysis. With boomerang, you find out exactly how fast your users think your site is.

boomerang is open source and released under the BSD license, and we have a whole bunch of documentation about it.

how?

  • Use cases — Just some of the uses of boomerang that we can think of
  • How it works — A short description of how boomerang works internally
  • Help, bugs, code — This is where the community comes in
  • TODO — There's a lot that we still need to do. Wanna help?
  • Howto docs — Short recipes on how to do a bunch of things with boomerang
  • API — For all you hackers out there

who?

boomerang comes to you from the Exceptional Performance team at Yahoo!, aided by the Yahoo! Developer Network.

where?

Get the code from github.com/yahoo/boomerang.

Measure Anything, Measure Everything « Code as Craft


Measure Anything, Measure Everything

Posted by Ian Malpass | Filed under data, engineering, infrastructure

If Engineering at Etsy has a religion, it’s the Church of Graphs. If it moves, we track it. Sometimes we’ll draw a graph of something that isn’t moving yet, just in case it decides to make a run for it. In general, we tend to measure at three levels: network, machine, and application. (You can read more about our graphs in Mike’s Tracking Every Release post.)

Application metrics are usually the hardest, yet most important, of the three. They’re very specific to your business, and they change as your applications change (and Etsy changes a lot). Instead of trying to plan out everything we wanted to measure and putting it in a classical configuration management system, we decided to make it ridiculously simple for any engineer to get anything they can count or time into a graph with almost no effort. (And, because we can push code anytime, anywhere, it’s easy to deploy the code too, so we can go from “how often does X happen?” to a graph of X happening in about half an hour, if we want to.)

Meet StatsD

StatsD is a simple NodeJS daemon (and by “simple” I really mean simple — NodeJS makes event-based systems like this ridiculously easy to write) that listens for messages on a UDP port. (See Flickr’s “Counting & Timing” for a previous description and implementation of this idea, and check out the open-sourced code on github to see our version.) It parses the messages, extracts metrics data, and periodically flushes the data to graphite.

We like graphite for a number of reasons: it’s very easy to use, and has very powerful graphing and data manipulation capabilities. We can combine data from StatsD with data from our other metrics-gathering systems. Most importantly for StatsD, you can create new metrics in graphite just by sending it data for that metric. That means there’s no management overhead for engineers to start tracking something new: simply tell StatsD you want to track “grue.dinners” and it’ll automagically appear in graphite. (By the way, because we flush data to graphite every 10 seconds, our StatsD metrics are near-realtime.)

Not only is it super easy to start capturing the rate or speed of something, but it’s very easy to view, share, and brag about them.

Why UDP?

So, why do we use UDP to send data to StatsD? Well, it’s fast — you don’t want to slow your application down in order to track its performance — but also sending a UDP packet is fire-and-forget. Either StatsD gets the data, or it doesn’t. The application doesn’t care if StatsD is up, down, or on fire; it simply trusts that things will work. If they don’t, our stats go a bit wonky, but the site stays up. Because we also worship at the Church of Uptime, this is quite alright. (The Church of Graphs makes sure we graph UDP packet receipt failures though, which the kernel usefully provides.)
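To make the fire-and-forget idea concrete, here is a minimal Python sketch of a client that sends one counter increment over UDP. It is not Etsy's PHP library; the helper names `format_counter` and `send_metric` are invented for this example, and the host and port (127.0.0.1:8125) are assumptions about where a StatsD daemon might be listening.

```python
import socket


def format_counter(bucket, delta=1):
    """Build a StatsD counter line such as 'grue.dinners:1|c'."""
    return "{0}:{1}|c".format(bucket, delta)


def send_metric(line, host="127.0.0.1", port=8125):
    """Send one metric line over UDP and never raise if delivery fails."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(line.encode("ascii"), (host, port))
        return True
    except OSError:
        # Fire-and-forget: a lost stat must not break the application.
        return False
    finally:
        sock.close()


send_metric(format_counter("grue.dinners"))
```

The call returns without waiting for any reply, so the application behaves the same whether or not anything is listening on the other end.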

Measure Anything

Here’s how we do it using our PHP StatsD library:

StatsD::increment("grue.dinners");

That’s it. That line of code will create a new counter on the fly and increment it every time it’s executed. You can then go look at your graph and bask in the awesomeness, or for that matter, spot someone up to no good in the middle of the night:

Graph showing login successes and login failures over time

We can use graphite’s data-processing tools to take the data above and make a graph that highlights deviations from the norm:

Graph showing login failures per attempt over time

(We sometimes use the “rawData=true” option in graphite to get a stream of numbers that can feed into automatic monitoring systems. Graphs like this are very “monitorable.”)

We don’t just track trivial things like how many people are signing into the site — we also track really important stuff, like how much coffee is left in the kitchen:

Graph showing coffee availability over time

Time Anything Too

In addition to plain counters, we can track times too:

$start = microtime(true);
eat_adventurer();
StatsD::timing("grue.dinners", (microtime(true) - $start) * 1000);

StatsD automatically tracks the count, mean, maximum, minimum, and 90th percentile times (which is a good measure of “normal” maximum values, ignoring outliers). Here, we’re measuring the execution times of part of our search infrastructure:

Graph showing upper 90th percentile, mean, and lowest execution time for auto-faceting over time
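As a rough illustration of the summary values listed above, the following Python sketch computes count, mean, minimum, maximum, and a 90th percentile for one batch of timings. It uses the nearest-rank percentile definition, which is an assumption made for this example and may differ in detail from what StatsD does internally; `summarize_timings` is a hypothetical name.

```python
def summarize_timings(timings_ms, percentile=90):
    """Summarise one batch of timings given in milliseconds."""
    if not timings_ms:
        raise ValueError("need at least one timing")
    values = sorted(timings_ms)
    count = len(values)
    # Nearest-rank percentile: the smallest value with at least
    # `percentile` percent of the samples at or below it.
    rank = max(1, (count * percentile + 99) // 100)
    return {
        "count": count,
        "mean": sum(values) / float(count),
        "lower": values[0],
        "upper": values[-1],
        "upper_%d" % percentile: values[rank - 1],
    }


stats = summarize_timings([10, 20, 30, 40, 50, 60, 70, 80, 90, 1000])
print(stats)
```

In this sample batch the single 1000 ms outlier dominates the maximum but leaves the 90th percentile at 90 ms, which is why the percentile is the more useful "normal worst case" figure.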

Sampling Your Data

One thing we found early on is that if we want to track something that happens really, really frequently, we can start to overwhelm StatsD with UDP packets. To cope with that, we added the option to sample data, i.e. to only send packets a certain percentage of the time. For very frequent events, this still gives you a statistically accurate view of activity.

To record only one in ten events:

StatsD::increment("adventurer.heartbeat", 0.1);

What’s important here is that the packet sent to StatsD includes the sample rate, and so StatsD then multiplies the numbers to give an estimate of a 100% sample rate before it sends the data on to graphite. This means we can adjust the sample rate at will without having to deal with rescaling the y-axis of the resulting graph.
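The two halves of that scheme, deciding on the client whether to send and scaling back up on the server, can be sketched in Python as follows. The function names are hypothetical, and the `|@0.1` suffix is how the StatsD line format carries the sample rate; the rest is a simplified model rather than the daemon's real code.

```python
import random


def sampled_counter_line(bucket, sample_rate, rng=random.random):
    """Return a counter line tagged with its sample rate, or None if skipped."""
    if rng() >= sample_rate:
        return None  # this event is not reported at all
    return "{0}:1|c|@{1}".format(bucket, sample_rate)


def scale_to_full_rate(received_count, sample_rate):
    """Estimate the true event count from the sampled count."""
    return received_count / sample_rate


# Force the random draw so the example is repeatable.
print(sampled_counter_line("adventurer.heartbeat", 0.1, rng=lambda: 0.05))
print(scale_to_full_rate(12, 0.1))
```

Because the rate travels with every packet, the client can change it at any time and the server-side estimate stays on the same scale.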

Measure Everything

We’ve found that tracking everything is key to moving fast, but the only way to do it is to make tracking anything easy. Using StatsD, we enable engineers to track what they need to track, at the drop of a hat, without requiring time-sucking configuration changes or complicated processes.

Try StatsD for yourself: grab the open-sourced code from github and start measuring. We’d love to hear what you think of it.


Fully Automated MySQL slow log analysis on Amazon RDS


At Memonic we rely on MySQL for most of our data storage. In any relational database system the correct creation of indices is important, otherwise queries will be inefficient and slow. The problem is that indices are often forgotten, especially when updating an existing query. As a tool to detect queries without proper indices, MySQL offers the slow query log. All queries that take more than a certain time are logged there.

We host our platform in Amazon’s cloud. For our database we rely on their Relational Database Service (RDS). As we don’t have root access to those machines, we can’t just tail the slow log to see what’s up. Instead, RDS optionally writes the slow log into a special system table. From there a query can be used to retrieve the data. See the Amazon RDS FAQ about how to configure the slow log on RDS.

For automated analysis of the slow logs we like to use mk-query-digest. This excellent utility groups all logged queries together by their general type and thus allows a big-picture overview. As an example take these three queries that may have been logged:

SELECT * FROM events WHERE user_id = 'foo';
SELECT * FROM data WHERE created_at >= '2011-12-13 10:12:00';
SELECT * FROM events WHERE user_id = 'bar';

These will be grouped together by mk-query-digest as just two queries:

SELECT * FROM events WHERE user_id = '?';
SELECT * FROM data WHERE created_at >= '?';

This is accompanied with how often each query type was executed, how long it took in total, etc. This is a great way to focus any optimization effort first on the queries that are actually used a lot.
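The grouping step can be approximated in a few lines of Python. This is a deliberately simplified sketch, not mk-query-digest's real fingerprinting rules: it only replaces quoted strings and bare numbers with placeholders, and the `fingerprint` helper is a name chosen for this example.

```python
import re
from collections import Counter


def fingerprint(sql):
    """Collapse literal values so queries of the same shape compare equal."""
    sql = re.sub(r"'(?:[^'\\]|\\.)*'", "'?'", sql)  # quoted string literals
    sql = re.sub(r"\b\d+\b", "?", sql)              # bare numeric literals
    return re.sub(r"\s+", " ", sql).strip()         # normalise whitespace


logged = [
    "SELECT * FROM events WHERE user_id = 'foo';",
    "SELECT * FROM data WHERE created_at >= '2011-12-13 10:12:00';",
    "SELECT * FROM events WHERE user_id = 'bar';",
]
counts = Counter(fingerprint(q) for q in logged)
for query, n in counts.most_common():
    print(n, query)
```

Counting by fingerprint is what lets the report rank query types by how often they occur rather than listing every logged statement separately.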

Unfortunately mk-query-digest only works with the normal MySQL slow query log format and can’t access the proprietary table that Amazon RDS keeps. To work around this, we wrote the db2log.py script which we hereby release into the public domain.

#!/usr/bin/env python
"""
Queries the slowlog database table maintained by Amazon RDS and outputs it in
the normal MySQL slow log text format.
"""

import _mysql

db = _mysql.connect(db="mysql", read_default_file="/root/.my.cnf")
db.query("""SELECT * FROM slow_log ORDER BY start_time""")
r = db.use_result()

print """/usr/sbin/mysqld, Version: 5.1.49-3-log ((Debian)). started with:
Tcp port: 3306 Unix socket: /var/run/mysqld/mysqld.sock
Time Id Command Argument
"""

while True:
    results = r.fetch_row(maxrows=100, how=1)
    if not results:
        break

    for row in results:
        row['year'] = row['start_time'][2:4]
        row['month'] = row['start_time'][5:7]
        row['day'] = row['start_time'][8:10]
        row['time'] = row['start_time'][11:]

        # query_time is sliced as HH:MM:SS and converted to whole seconds.
        hours = int(row['query_time'][0:2])
        minutes = int(row['query_time'][3:5])
        seconds = int(row['query_time'][6:8])
        row['query_time_f'] = hours * 3600 + minutes * 60 + seconds

        # lock_time uses the same HH:MM:SS layout.
        hours = int(row['lock_time'][0:2])
        minutes = int(row['lock_time'][3:5])
        seconds = int(row['lock_time'][6:8])
        row['lock_time_f'] = hours * 3600 + minutes * 60 + seconds

        if not row['sql_text'].endswith(';'):
            row['sql_text'] += ';'

        print '# Time: {year}{month}{day} {time}'.format(**row)
        print '# User@Host: {user_host}'.format(**row)
        print '# Query_time: {query_time_f} Lock_time: {lock_time_f} Rows_sent: {rows_sent} Rows_examined: {rows_examined}'.format(**row)
        print 'use {db};'.format(**row)
        print row['sql_text']


It simply dumps the current contents of the RDS slow_log table in the same format that MySQL usually uses for its slow log file. This output can then be piped into mk-query-digest to generate a report.

We ended up doing just that in a daily cron job which sends a mail to our developers.

#!/bin/bash
(/usr/local/bin/db2log | \
    mk-query-digest --fingerprints \
        --filter '$event->{user} !~ m/^(bi|memonic)$/') 2>&1 | \
        mail -s "MySQL slow logs" root

# Rotate slow logs. Will move them into the backup table slow_log_backup. If
# that table exists it's overwritten with the primary slow log.
# So with this strategy we can still access yesterday's slow log by querying
# slow_log_backup.
mysql mysql -e 'CALL rds_rotate_slow_log'

There is one line which needs further explanation: the filter. We filter out any slow log events that were triggered by either the bi or the memonic users. The former is used for asynchronous generation of some statistics and performance isn’t required for that. The latter we use for ad-hoc queries which we don’t need to optimize.
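For readers less familiar with Perl, the same filter logic can be expressed in Python. This is only an equivalent sketch for illustration: the event dictionaries and the `keep_event` helper are hypothetical, and the real filtering happens inside mk-query-digest.

```python
import re

# Matches the user name only when it is exactly "bi" or "memonic".
SKIPPED_USERS = re.compile(r"^(bi|memonic)$")


def keep_event(event):
    """Return True for slow log events that should appear in the report."""
    return not SKIPPED_USERS.search(event["user"])


events = [{"user": "bi"}, {"user": "memonic"}, {"user": "webapp"}, {"user": "bingo"}]
print([e["user"] for e in events if keep_event(e)])
```

The anchors at both ends of the pattern matter: without them a user such as "bingo" would also be dropped because its name starts with "bi".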

So there you have it: an automated mechanism to analyze slow MySQL queries. From time to time when deploying a new release a new slow query may pop up. But the next day we are informed about it and can fix the issue.
