Lesson 3: Open Schema

    ID     UPS  DOWNS  TITLE                                    URL
    12345  120  34     Boffins Create Zombie Dog!               www.someaussiesite.co.au/dog.html
    12346  3    24     Check out my new blog!                   noobspamer.blogspot.com
    12347  509  167    Pee in a sink if you've ever voted up.   self
Lesson 3: Open Schema

Thing table:

    ID     UPS  DOWNS  TYPE
    12345  120  34     Link
    12346  3    24     Link

Data table:

    THING_ID  KEY    VALUE
    12345     Title  Boffins Create Zombie Dog!
    12345     URL    www.someaussiesite.com.au/zombiedog.html
    12346     Title  Pee in a sink if you've ever voted up.
    12346     URL    self
reddit began as a way to share links. Now there are thousands of communities: some for news, some that behave like forums. It's the premier way to waste time at work.
Originally, we didn't detect crashes very well. I would wake up every couple of hours and check whether things were working. Friends would have to call me when the site went down; I dreaded the sound of my phone ringing, and I ruined many a dinner trying to fix reddit. Once I had to run across the street to an Apple store to use a terminal to fix things. Bringing supervise into the mix fixed many of our woes: if the app died, supervise automatically restarted it. We also wrote scripts to detect weird states and deliberately crash the app. Running out of connections? Let the app server crash, and all of a sudden you'll have plenty of connections. The same trick works for a memory leak or a deadlocked thread.
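The core of what supervise does is simple enough to sketch in a few lines. This is a toy illustration, not daemontools itself; the `max_restarts` and `backoff` parameters are my own additions so the loop terminates, where the real tool restarts forever.

```python
import subprocess
import time

def supervise(cmd, max_restarts=3, backoff=0.1):
    """Keep restarting cmd whenever it exits, in the spirit of
    daemontools' supervise. max_restarts and backoff are illustrative
    knobs for this sketch, not options of the real tool."""
    restarts = 0
    while restarts < max_restarts:
        proc = subprocess.Popen(cmd)
        proc.wait()              # block until the app dies
        restarts += 1
        time.sleep(backoff)      # brief pause before restarting
    return restarts
```

The crash-on-purpose scripts mentioned above fit naturally into this model: any watchdog that kills a wedged process gets a fresh one for free from the restart loop.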
We started with one machine running the web server, app server, and database. Things were slow, but it wasn't clear why: CPU wasn't too high, and memory was under control. Context switching was killing us. Adding that second machine was a huge breath of fresh air, and we learned this lesson multiple times over the years. It's not just about separating services, but also data: breaking links and comments apart gave a large performance increase. PostgreSQL is great, but it doesn't like to share.
In the early days our schema looked like this. It was fairly straightforward: normalized, with lots of foreign keys and complex many-to-many relationships. A table each for links, accounts, and comments, with a column for every attribute.
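A toy version of that original normalized layout might look like the following. The table and column names are illustrative, not reddit's actual schema; SQLite stands in for PostgreSQL.

```python
import sqlite3

# One table per type, one column per attribute, foreign keys tying
# them together -- the classic normalized shape described above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE links (
        id INTEGER PRIMARY KEY,
        author_id INTEGER REFERENCES accounts(id),
        title TEXT, url TEXT, ups INTEGER, downs INTEGER
    );
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        link_id INTEGER REFERENCES links(id),
        author_id INTEGER REFERENCES accounts(id),
        body TEXT
    );
""")
conn.execute("INSERT INTO accounts VALUES (1, 'spez')")
conn.execute(
    "INSERT INTO links VALUES (1, 1, 'Boffins Create Zombie Dog!', "
    "'www.someaussiesite.com.au/zombiedog.html', 120, 34)")

# Rendering a page means joining across tables to assemble the data.
row = conn.execute("""
    SELECT links.title, accounts.name
    FROM links JOIN accounts ON links.author_id = accounts.id
""").fetchone()
```

The join in the last query is the cost being paid for normalization: every page render reassembles rows scattered across tables.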
Every type of data is divided into two parts: things and data. The thing table stores properties common to all types; every type has ups, downs, and a creation date. The data table is just a list of key/value pairs. To run queries against the data table, we keep specific indices for each key type.
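The thing/data split can be sketched as an entity-attribute-value layout. Again SQLite stands in for PostgreSQL, and a partial index plays the role of the per-key indices mentioned above; the exact index mechanism reddit used is an assumption here.

```python
import sqlite3

# The thing table holds the properties every type shares; the data
# table is a bag of key/value pairs, one row per attribute.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE thing (id INTEGER PRIMARY KEY, ups INTEGER,
                        downs INTEGER, type TEXT, created REAL);
    CREATE TABLE data (thing_id INTEGER REFERENCES thing(id),
                       key TEXT, value TEXT);
    -- one index per key type, so lookups by e.g. url stay fast
    CREATE INDEX idx_data_url ON data (value) WHERE key = 'url';
""")
conn.execute("INSERT INTO thing VALUES (12345, 120, 34, 'link', 0)")
conn.executemany("INSERT INTO data VALUES (?, ?, ?)", [
    (12345, 'title', 'Boffins Create Zombie Dog!'),
    (12345, 'url', 'www.someaussiesite.com.au/zombiedog.html'),
])

# Adding a new attribute needs no ALTER TABLE: just insert another row.
conn.execute("INSERT INTO data VALUES (12345, 'flair', 'science')")

thing_id, = conn.execute(
    "SELECT thing_id FROM data WHERE key = 'url' AND value = ?",
    ('www.someaussiesite.com.au/zombiedog.html',)).fetchone()
```

The win is schema flexibility: new features add rows, not columns, and no migration touches the thing table.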
In the beginning we had only one application server, and we developed all sorts of bad habits. The app server was an always-running Lisp process, and we stored all sorts of state per user. We didn't have memcached; we just kept data in an in-memory hashtable. When we switched to Python, we preserved this, so when we added multiple app servers we were in a bad way: every app had to share this cache, and we were duplicating the entire cache on each app server. We couldn't switch to memcache right away because we had too many keys.
All queries are generated by the same piece of code, which makes general caching simple. What limited state we have gets put in memcache: password resets and captchas, for example. Every element of every page is cached; we group small elements into bigger pieces and cache the blobs. Slow function? Memoize it: the normalized hot page, for instance, or more complex database lookups. Memcachedb holds listings, comment trees, and slow queries (like by_url).
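The "slow function? memoize it" idea can be sketched like this. A plain dict stands in for the memcache client, and `normalized_hot` is a placeholder for the real expensive computation; none of these names are from reddit's codebase.

```python
import functools

cache = {}  # stand-in for a memcached client

def memoize(fn):
    """Cache a slow function's results under a key built from its
    name and arguments, much like a memcache key string."""
    @functools.wraps(fn)
    def wrapper(*args):
        key = (fn.__name__,) + args
        if key not in cache:
            cache[key] = fn(*args)   # compute once...
        return cache[key]            # ...serve from cache thereafter
    return wrapper

calls = 0

@memoize
def normalized_hot(subreddit):
    """Pretend this is the expensive hot-page computation."""
    global calls
    calls += 1
    return sorted(["zombie-dog", "new-blog"])  # placeholder listing

normalized_hot("all")
normalized_hot("all")   # second call never touches the function body
```

The real version has to worry about invalidation and key expiry, which a dict cheerfully ignores; that is where memcache's TTLs come in.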
When we first began, we had a nice, consistent, normalized database. That meant we had to do a lot of work to gather the right data to render a page. This can be mitigated with caching, but caching isn't a cure-all. What we do now is more like pre-emptive caching: we store complete listings and complete comment trees. A link might appear in a front-page listing or on a user's profile page; a comment might appear with a link or in a user's inbox. Each of these is stored pre-computed and ready to go. For some listings we store as many as 15 versions, with different sorts and different time periods.
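The pre-emptive caching idea can be sketched as computing every sorted version of a listing up front, at write time, so render time is just a lookup. The sort names and scoring formulas here are illustrative, not reddit's actual ranking code.

```python
# A handful of links with the vote counts from the schema example above.
links = [
    {"id": 12345, "ups": 120, "downs": 34,  "created": 3},
    {"id": 12346, "ups": 3,   "downs": 24,  "created": 1},
    {"id": 12347, "ups": 509, "downs": 167, "created": 2},
]

# Illustrative sort orders, one per pre-computed listing version.
SORTS = {
    "top": lambda l: l["ups"] - l["downs"],          # net score
    "new": lambda l: l["created"],                   # recency
    "controversial": lambda l: min(l["ups"], l["downs"]),
}

# Recompute every sorted version when a vote or submission happens,
# rather than sorting at render time.
precomputed = {
    name: [l["id"] for l in sorted(links, key=key, reverse=True)]
    for name, key in SORTS.items()
}

precomputed["top"]   # -> [12347, 12345, 12346]
```

Serving a page then reads one pre-built list instead of querying and sorting; the cost moves to the write path, which is far less frequent than reads.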