SXSW!

This post is a bit of a diversion for this blog, which is usually me babbling on about tech.

SXSW starts this Thursday, and I’ll be attending. You can catch me speaking on a funny panel called “How not to be a Douchebag at SXSW”, which Ed Huntsinger is hosting. But hey, it won’t be about tech. So maybe you won’t like that.

SXSW Interactive is great, but it’s like work to me; I always have to be on and make connections with people for work. Sometime around next Tuesday, the music portion of the event will start, and it’s a blissful mix of music, people, and madness.

I’ve spent the last couple of days or so going through the torrents for SXSW Music. 646 songs on the first one, and another 309 on the second torrent.

You can grab both of them here if you want to download them. They’re huge, but worth it.

As is the case with most submissions, about 90% of it is crap, but if you’re willing to sit through and listen to the music there’s a wealth of wonderful artists here.

Here’s my top 15 or so, in no particular order:

Allo Darlin’ – My Heart Is A Drummer

Something like the 60’s, yet reminiscent of Sing-Sing and some of the more country inspired 4AD artists.

ELEW – Mr. Brightside

Eric Lewis, AKA ELEW, is a mindblowing pianist, playing what he likes to call Rockjazz on piano. He’s played TED and our own DNA Lounge. This is his cover of Mr. Brightside by the Killers.

The Golden Filter – Hide Me

Blissful, female vocal based, bouncy disco-electro perfect for the dancefloor.

The Heavenly States – Oui Camera Oui

A terrible recording with little to no dynamic range that reminded me of early Julian Cope.

Kill The Noise – All Too Vivid

I’d expect to hear this song at 1015 (a house club here in SF) at about 3:30 in the morning. Incessant, bouncing house beat with heavily vocoded nothingness on top. Who just bought an Access Virus? You did, yes you did.

Lights – Saviour

This is what it would sound like if Canada tried to produce the LA Wall-of-sound pop sound. Female vocals, big, big bridges and choruses. Expect to hear this on MTV soon. I love this piece of pop aside from the minor Auto-Tune glitches that exist all over it. Nice harmonies, but the lyrics? Forget it. It’s pop.

Luder – Sing to Me

I think I picked these guys because I’ve been on a strange metal bent lately (I blame Jack Black and too many late night PS3 sessions of Brutal Legend). Unfortunately this sounds like a goth band trying to compete for vocal bandwidth with Slayer. It sort of works.

Maren Parusel – Dear Love

The likeness of Maren’s voice to Feist is what does it for me here. Breathy female vocals on a solid rock background with minimal rhythm guitar. A string quartet pops up unexpectedly to lift the chorus.

Margaret Cho – Eat Shit and Die

Can’t go wrong with Margaret Cho.

Minipop – Precious

Minipop is the sort of breathy, reverb-laden beauty that attracted me to bands like Love Spirals Downwards and whatever Projekt Records was vomiting up in the 90’s. The thing here is that Minipop throws away all of that pretension for perfect, blissed-out shoegazer wonderfulness. I saw them at the Independent in SF last year and fell in love with them on the first listen. It helps that the singer is a 5′2″ elf-like creature. Loads of delay, lots of chorus. Love it.

Noush Skaugen – Run Baby Run

Guitars meet keyboards with a driving bass that reminds me of early Jesus and Mary Chain, but there are far too many drums to really make the comparison hold true. The sort of music you’d want as the wind rips through your hair at 80mph down the highway.

Resplandor – Downfall

A reincarnation of Slowdive. Wall of guitars everywhere. I feel like I’m back in 1992. Excellent stuff.

Ruby Isle – So Damn High (Will Eastman Club Edit)

Big, giant kick drum and vocal samples everywhere. Electro at its most base.

Sex with Strangers – New City Anthem

Think first album Shiny Toy Guns with far less production and a bit of Human League thrown in for fun.

Sofia Talvik – Jonestown

Think Suzanne Vega and the Sundays coming together, without all that needless strumming of the 12-string. I won’t go to see her play at SXSW, but I’d listen to it with the lights out.

Posted by John Adams on March 9th, 2010

Speaking Engagements for 2010

Hello! It’s 2010, a new year and I’ve got some speaking engagements coming up, where I’ll discuss Twitter operations and scaling.

Web 2.0 Expo 2010, San Francisco, CA
May 3rd-6th
In the Belly of the Whale: Operations at Twitter

Chirp (Twitter’s Official Developer Conference)
San Francisco, CA
April 15th, 2010, 4:30PM (Hack Day)
Scaling Twitter: I’ll be speaking on some of the issues we’ve experienced in scaling Twitter, and you’ll get to meet members of our Ops team.

Velocity 2010
San Francisco, CA
June 22nd-24th, 2010
Waiting for proposal approval, but probably will be similar to my talk at Chirp

Posted by John Adams on January 25th, 2010

convergence.

Engadget recently ran an article describing how YouTube is blocking 1080p content from select sites that let users display it on televisions instead of computer monitors, or that use the YouTube API. Like last year’s Hulu block, which stopped the PS3 from playing shows, it marks another moment when television content producers failed to ‘get it’. Their understanding of content in the face of their own dying industry is poor and misguided.

The blocks on both services are easily removed through the use of a proxy that can replace the browser’s header in the outbound HTTP request.
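As a rough illustration of what such a proxy does to each outbound request, here is a minimal Python sketch. It assumes the header being replaced is User-Agent, which the paragraph above does not actually name, and `rewrite_user_agent` is a hypothetical helper rather than part of any real proxy.

```python
from typing import Dict


def rewrite_user_agent(headers: Dict[str, str], new_user_agent: str) -> Dict[str, str]:
    """Return a copy of the request headers with the User-Agent replaced.

    Header names are case-insensitive in HTTP, so any existing variant
    ("user-agent", "User-Agent", ...) is dropped before the new one is set.
    The original dict is left untouched.
    """
    rewritten = {
        name: value
        for name, value in headers.items()
        if name.lower() != "user-agent"
    }
    rewritten["User-Agent"] = new_user_agent
    return rewritten
```

A real proxy would apply something like this to every request it forwards; everything else about the request stays the same.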

I have both professional and personal experience in media convergence; throughout my career I’ve worked for three companies that did streaming video, from adult content (Gamelink) to mainstream media and independent film (Ifilm/Viacom). On a personal level, the flooding of my loft space has forced me into some temporary housing where I currently cannot install Internet or cable service, and I’m forced into using the slow (but not entirely awful) landlord-provided WiFi.

Initially the WiFi service was a nightmare, but after the introduction of a pair of Meraki mesh access points, I was able to boost the signal to the point where the PS3 and laptops in the living room could access video. Meraki’s hardware has proven to be excellent under poor signal conditions and simple to use.

On the big Samsung TV that I own, this leaves me with a few options for video at home:

  • Hulu/YouTube via the PS3
  • The same, via laptop
  • Pay-to-play via the PlayStation Store
  • Pay-to-play via iTunes
  • Basic Cable (no DVR, no channels, no time-shifting)

Most of these are great options (basic cable notwithstanding). The laptop-based options require me to connect cables and to lose the use of my laptop for the duration of the show, and because of the way the Mac supports full-screen websites, I can’t use fullscreen and the laptop’s screen at the same time. The PS3 is slow to download (although it delivers some of the best video I’ve seen on my TV), and all of the laptop options are inconvenient because of the cables.

It’s not about the technology either; we have the technology! It all works, just not as smoothly as the experience of loafing in front of the TV and pressing a couple of buttons on a remote.

Content creators should be making every attempt to make it easier to consume their content, with advertising. There’s a duality here, where the online video world treats the laptop as a first-class citizen and the TV as a second-class citizen, and vice versa when it comes to the Big Media world of television.

All of this is about money: who is paid and who is not in the big business of the media world. The blocking needs to stop, and ad revenues need to be shared between the content creators and the new distribution world of digital devices connected to large screens. There is fundamentally no difference between a large monitor and the large flatscreen in front of my couch.

Posted by John Adams on November 20th, 2009

You! Stuck in a hotel? Want Wi-Fi?

Here are two scripts you can run if you have a 3G card, and a few friends stuck in the hotel room with you who want WiFi.

First, turn your laptop into an access point. These scripts work on Mac OS X 10.5.4 or better. Good luck!

I call this script ‘make-me-an-access-point.sh’:

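#!/bin/sh
#
# make-me-an-access-point.sh
# Share the 3G link (ppp0) over the Airport interface (en1 here). Run as root.
#
# Edit your /etc/rc.conf to set the option firewall_enable to YES
# Edit your /etc/rc.firewall to add lines:

# Start NAT on the 3G interface and turn on IP forwarding
/usr/sbin/natd -dynamic -interface ppp0
sysctl -w net.inet.ip.forwarding=1

# Reset the firewall, then divert everything crossing ppp0 through natd
/sbin/ipfw -f flush
/sbin/ipfw add 1000 pass all from 127.0.0.1 to 127.0.0.1
/sbin/ipfw add 2000 divert natd ip from any to any via ppp0
/sbin/ipfw add 6500 pass all from any to any

# Give the wireless interface the router address that dhcpd.conf hands out.
# The broadcast address must match the 192.168.1.0/24 subnet.
ifconfig en1 192.168.1.1 up netmask 255.255.255.0 broadcast 192.168.1.255

# Start the DHCP server
dhcpd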

When you want this to go away, run:

laptop:bin jna$ cat undo-make-me-an-access-point.sh
#!/bin/sh
#
# undo the make-me-an-access-point script
#
sudo sysctl -w net.inet.ip.forwarding=0

sudo /sbin/ipfw -f flush
# '[d]hcp' keeps grep from matching its own process; VMware's DHCP daemon is left alone
sudo kill -9 `ps -efl | grep '[d]hcp' | egrep -v VMware | awk '{ print $2 }'`

echo "Now turn off the Airport, then wait a moment, and turn it back on."

The contents of my dhcpd.conf are very simple:

# dhcpd.conf
#
# Sample configuration file for ISC dhcpd
#
ddns-update-style ad-hoc;

# option definitions common to all supported networks...
option domain-name "retina.net";
option domain-name-servers 209.183.50.151;
default-lease-time 600;
max-lease-time 7200;

# If this DHCP server is the official DHCP server for the local
# network, the authoritative directive should be uncommented.
authoritative;

# Use this to send dhcp log messages to a different log file (you also
# have to hack syslog.conf to complete the redirection).
log-facility local7;

# Hand out addresses on the access point's subnet; the router option
# must match the address given to the wireless interface.

subnet 192.168.1.0 netmask 255.255.255.0 {
  range 192.168.1.100 192.168.1.200;
  option routers 192.168.1.1;
  option domain-name-servers 206.13.28.12;
  option broadcast-address  192.168.1.255;
}
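The addresses in the ifconfig line and in dhcpd.conf have to agree with each other, and a mismatched broadcast address is an easy mistake to make. As a quick sanity check, here is a small Python sketch using the standard `ipaddress` module; `check_dhcp_range` is a hypothetical helper name, and the values in the usage note are simply the ones from the config above.

```python
import ipaddress
from typing import Tuple


def check_dhcp_range(router: str, netmask: str,
                     range_start: str, range_end: str) -> Tuple[str, bool]:
    """Return (broadcast address, whether the DHCP range fits the subnet)."""
    # strict=False lets us pass the router's host address instead of the
    # network address; the host bits are masked off for us.
    network = ipaddress.ip_network(f"{router}/{netmask}", strict=False)
    start = ipaddress.ip_address(range_start)
    end = ipaddress.ip_address(range_end)
    range_ok = start in network and end in network and start <= end
    return str(network.broadcast_address), range_ok
```

Calling `check_dhcp_range("192.168.1.1", "255.255.255.0", "192.168.1.100", "192.168.1.200")` gives the broadcast address to use on the ifconfig line and confirms that the lease range sits inside the subnet.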

Posted by John Adams on July 10th, 2009

Two great talks for Wednesday

Two great presentations I’ve recently read and want to share with you:

Tim O’Reilly and John Battelle’s “Web Squared: Web 2.0 Five Years On”.

Read that, perhaps while watching/listening to Kevin Kelly’s riveting talk (from the TED EG conference) on the next 5000 days of the web (that’s right, we’re only 5000 days old) and the future of the Semantic Web. Kelly says we’re building not a series of small machines on the Internet, but one gigantic thinking machine, approaching the connectivity level of the human mind.

Posted by John Adams on July 8th, 2009

Velocity 2009

Last Tuesday, I was part of the Velocity 2009 Keynote, where I gave a talk entitled “Fixing Twitter”. I covered the last year or so of work on improving Twitter to deal with the massive traffic and user loads we’ve been under, and how we use metrics to destroy the fail-whale.

Details of the talk are available on the Velocity 2009 site.

You can watch the presentation (on blip.tv) and download a PDF containing all of the slides here.

Update: It looks like blip.tv and O’Reilly moved some links around. Page updated.

Posted by John Adams on June 28th, 2009

Predicting the End of the World with Mathematica

Frequently you want to predict when things are going to happen, and if it’s not the end of the world, it might be something occurring a bit sooner, such as your disk filling up.

First, capture some data with cron. We want to track how fast our database is eating free space, so we record the size of a file once a day by putting something like this in cron, set for every night at midnight:

0 0 * * * ls -l /var/log/somefile | tail -1 >> /tmp/somefile_log

Wait a few days. We were looking at daily file growth, so we waited a full week to collect data. We had the luxury at the time.

Now you’ll have a file with a series of ls entries in it. Run those through awk, and capture the sizes of the files.

awk '{ print $5 }' /tmp/somefile_log
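If you would rather do the awk step in Python, a minimal sketch looks like this. It assumes each log line is ordinary `ls -l` output, where the size in bytes is the fifth whitespace-separated field; `sizes_from_ls_log` is a hypothetical helper name.

```python
from typing import Iterable, List


def sizes_from_ls_log(lines: Iterable[str]) -> List[int]:
    """Pull the size column (field 5) out of a series of `ls -l` lines."""
    sizes = []
    for line in lines:
        fields = line.split()
        # Skip blank or malformed lines rather than failing on them.
        if len(fields) >= 5 and fields[4].isdigit():
            sizes.append(int(fields[4]))
    return sizes
```

You would feed it the lines of /tmp/somefile_log, for example with `sizes_from_ls_log(open("/tmp/somefile_log"))`.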

At this point, it’s time to fire up Mathematica. Mathematica is a stunning piece of software by Wolfram Research, used for data visualization, scientific work, and in a number of industries.

First, let’s load the data into Mathematica.

Next, copy the first data point into the Fit call and create a function that will let us predict the future.

(* Fit data to a curve using a polynomial model; make sure you insert the 1st data point or the curve fit will be bad *)
result =  Fit[data, {336660004864, x, x^2}, x]

(* current free space on our partition (312334824), refunding the 1st data point as that space is already alloc'd *)
diskfree = (312334824 * 1024) + Take[data, 1]

(* use the fit function to find doomsday *)
diskfreefunc =  diskfree - result

(* when diskfreefunc reaches 0, we are dead; keep the last root NSolve returns *)
deathday =  NSolve[diskfreefunc == 0, x]
deathday = Take[x /. %, -1]
deathday = deathday[[1]]
DatePlus[datastartdate, N[deathday]]

So now we know when this data set will hit zero, we have the date of that failure, and an ability to graph when that will happen.

For this example, my data set turns into:

We now know that in ~64 days, we’ll run out of disk space. Prediction is pretty nice, huh?

If you’re a stats type, you’ll want to know how good the fit is for this curve, and for that, we look at R².
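If you don’t have Mathematica handy, the same idea can be sketched in plain Python. Note that this swaps in a simpler straight-line least-squares fit rather than the polynomial Fit used above, so it is an approximation of the approach, not a port of it. `fit_growth` and `days_until_full` are hypothetical names, and the free space is assumed to be measured at the time of the last sample.

```python
from typing import List, Optional, Tuple


def fit_growth(sizes: List[float]) -> Tuple[float, float, float]:
    """Fit a least-squares line through (day index, size).

    Returns (slope in bytes per day, intercept, R squared).
    """
    n = len(sizes)
    if n < 2:
        raise ValueError("need at least two daily samples")
    days = range(n)
    mean_x = sum(days) / n
    mean_y = sum(sizes) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, sizes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(days, sizes))
    ss_tot = sum((y - mean_y) ** 2 for y in sizes)
    # A perfectly flat series is fit exactly by a flat line.
    r_squared = 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


def days_until_full(sizes: List[float], free_bytes: float) -> Optional[float]:
    """Days after the last sample until free_bytes is used up, or None if not growing."""
    slope, _, _ = fit_growth(sizes)
    if slope <= 0:
        return None
    return free_bytes / slope
```

An R² close to 1 means the line explains most of the variation in the samples; if it is much lower, growth is probably not linear and a polynomial model like the one above is the better choice.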

Posted by John Adams on June 25th, 2009

Memcached and MySQL – What good is it?

I posted this in response to a post on GigaOM, but it was such a long comment that I felt it was worthy of a post on its own.

The workloads of social networking sites fall mostly into the ‘read lots, write once’ class (most of the web exists within this paradigm.) Regardless of the database company that’s responsible for the software, the main idea in scaling this read heavy workload is to remove the burden from the database and move it to distributed memory stores.

As an engineer, you want applications to pull from the same cache pool to reduce I/O pressure. To ensure that every machine isn’t replicating data in individual caches, you have to go distributed. That’s the win with memcached.

Putting a distributed cache between the application and the database increases performance and shares data across your application servers, something that the database cannot do on its own. The database has on-disk and in-memory caching, but eventually you’ll run out of memory on a single host if your working set exceeds the host’s memory.
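The read path described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a plain dict stands in for the shared memcached pool, `load_from_db` is a hypothetical callable that runs the real query, and nothing here is a real memcached client API.

```python
from typing import Callable, Dict


def cached_fetch(cache: Dict[str, str], key: str,
                 load_from_db: Callable[[str], str]) -> str:
    """Look in the shared cache first; only go to the database on a miss."""
    value = cache.get(key)
    if value is None:
        # Cache miss: pay for the database query once, then share the result.
        value = load_from_db(key)
        cache[key] = value
    return value
```

Because every application server talks to the same pool, the second server to ask for a key gets it from the cache and the database sees only one query.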

Memcached also covers up replication lag in large environments (MySQL is terrible at replication, Oracle not so much) by putting data into the distributed cache (write-through caching) before the slave database has finished its write. Data is available to clients immediately, before replication has completed.
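Here is a toy Python model of that write-through behavior. Everything in it is an assumption made for illustration: dicts stand in for the master database and the cache pool, and `LaggingReplica` is a hypothetical class that only applies writes when `catch_up` is called.

```python
from typing import Dict, List, Optional, Tuple


class LaggingReplica:
    """A read replica that applies writes later, simulating replication lag."""

    def __init__(self) -> None:
        self.rows: Dict[str, str] = {}
        self.pending: List[Tuple[str, str]] = []

    def enqueue(self, key: str, value: str) -> None:
        self.pending.append((key, value))

    def catch_up(self) -> None:
        for key, value in self.pending:
            self.rows[key] = value
        self.pending.clear()


def write_through(master: Dict[str, str], replica: LaggingReplica,
                  cache: Dict[str, str], key: str, value: str) -> None:
    """Write to the master, queue replication, and update the cache right away."""
    master[key] = value
    replica.enqueue(key, value)
    cache[key] = value


def read(cache: Dict[str, str], replica: LaggingReplica, key: str) -> Optional[str]:
    """Serve from the cache when possible, otherwise fall back to the replica."""
    if key in cache:
        return cache[key]
    return replica.rows.get(key)
```

A read that arrives right after `write_through` is served from the cache even though the replica has not applied the write yet; without the cache, that same read would come back empty until `catch_up` runs.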

It will also provide a large amount of savings when you’re constantly executing that O(n × m) query to find out who is friends with whom on your social networking site.

This comes with a cost, though. Relational database functions, like joining across large data sets, and atomic operations, become very difficult to execute. Memcached becomes the central server, and there is always a fear that an important key will drop out of cache because of a random eviction.

It’s not without risk, either. Dependence on the cache can hurt you severely if lots of memcached servers fail (and they do fail), leaving you in a ‘cold cache’ situation where it can take hours to repopulate your working set back into the cache pool.

Don’t question MySQL’s performance: relational databases are great, but they are not the only solution to storage problems. The two problems being solved here are highly orthogonal.

I’d also like to state that the majority of alternate key-value store databases listed in Richard Jones’ article and in Leonard Lin’s blog are really not ready for high production loads (with maybe the exception of Tokyo Cabinet, HDFS, and Cassandra). There is still a ton of ‘secret sauce’ the large sites are keeping quiet about in order to make these into effective data stores.

Lin states this in his review as well: “Your comfort-level running in prod may vary, but for most sane people, I doubt you’d want to.”

Tread lightly.

Posted by John Adams on May 17th, 2009

Announcing mod_memcache_block

I’m announcing the release of mod_memcache_block, a distributed IP blocking system for Apache, with rate limiting based on HTTP response code.

For many years I’ve had a need for a module like this: a distributed blocking system that can operate across large web serving clusters and register hits in a central store. With rate limiting, incrementing counters on a single host is fairly useless when you have hundreds of servers behind a load balancer.

An attacker could hit many machines within the limit period before being detected, because there would be no central count. By keeping the counts in a memcache pool, all servers share the same data.
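To make the shared-counter idea concrete, here is a minimal Python sketch of windowed rate limiting keyed on response code. It is not how mod_memcache_block is implemented: a dict stands in for the memcache pool, `record_response` and its parameters are hypothetical, and a fixed time window is assumed. With a real memcached pool you would use its atomic increment on a key that expires with the window.

```python
from typing import Dict


def record_response(store: Dict[str, int], ip: str, status: int, now: float,
                    limit: int = 5, window_seconds: int = 60,
                    watched_status: int = 401) -> bool:
    """Count watched responses per IP per time window in a shared store.

    Returns True once the IP has exceeded the limit in the current window.
    """
    window = int(now // window_seconds)
    key = f"{ip}:{watched_status}:{window}"
    if status == watched_status:
        # Every web server increments the same key, so the count is global.
        store[key] = store.get(key, 0) + 1
    return store.get(key, 0) > limit
```

Since all servers pass the same store, five failed logins spread across five different machines count the same as five against one machine.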

It won’t defend against attacks coming from random proxy addresses (say, Tor), and might unfairly count hundreds of users who live behind a single proxy (like corporate NAT), but it offers some protection against attacks coming from a single source IP.

The software is released under the Apache 2.0 Open Source License.

From the docs:

mod_memcache_block is an Apache module that allows you to block access to your servers using a block list stored in memcache. It also offers distributed rate limiting based on HTTP response code.

FEATURES

Distributed White and Black listing of IPs, ranges, and CIDR blocks
Configurable timeouts, memcache server listings
Support for consistent hashing using libmemcached’s Ketama
Windowed rate limiting based on response code (to block brute-force dictionary attacks against .htpasswd, for example)

REQUIREMENTS

libmemcached-0.25 or better
Memcached server
Apache 2.x (tested with 2.2.11)

Source code is available here:
http://github.com/netik/mod_memcache_block

If you would like to work on mod_memcache_block, contact me with your GitHub username and I’ll give you commit access on GitHub.

Posted by John Adams on May 7th, 2009

Velocity Preview

There’s a small interview with me in today’s O’Reilly Radar, where I talk about some of the things that I’ll be presenting as part of my Velocity 2009 talk. You can listen to it and read the transcript here:

Posted by John Adams on May 7th, 2009
