Steve Dougherty's Blog

Freenet

Freenet is a medium for censorship-resistant communication. It allows people to communicate by publishing and retrieving data securely and anonymously. When someone runs Freenet on their computer it is called a node. Each node connects with a limited number of other nodes. When two nodes are connected they are one another’s peers. Every node communicates with the rest of the network solely through its peers.

Each node has some amount of storage reserved for a datastore. A datastore is a shared space in which each node keeps data. Freenet can be thought of as a distributed, encrypted storage device. It allows inserting data into and fetching data from the network-wide datastore made up of all the individual nodes’ datastores.

In order to do this, Freenet must be able to determine which nodes to store data on, and later be able to find that data again. The process of finding a piece of data, or a place to store it, is called routing.

This is where math comes in. In graph theory, there is a type of network called a small-world network. A small-world network contains relatively short routes between any two nodes. This is good, because longer routes are slower and less reliable. Some types of small-world networks are especially interesting because they allow finding short routes with only locally available information. This is essential for Freenet because its nodes must perform routing with only locally available information through their limited number of peers.

Here’s the concept: all nodes have a network location, which is unrelated to geographical location. An inherent characteristic of every request sent into the network is that it has an ideal location to be routed to. Nodes route requests by giving them to their peer whose location is closest to that ideal location. In order for this to be effective, the network must have a specific characteristic: it must have a good distribution of “link lengths,” which are differences between the locations of connected nodes.

Locations can be thought of as wrapped around a circle: 0 at one point, approaching 1 as it goes around, then wrapping back to 0. 0.3 is 0.2 away from 0.5, and 0.1 is 0.2 away from 0.9. This distance between peers’ locations is called the connection’s link length. On average, nodes must have many connections with shorter link lengths, and a few connections with longer link lengths. One can think of this as being able to quickly make large leaps on the location circle and also make small adjustments.
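The wrap-around distance and the "hand it to the closest peer" rule can be sketched in a few lines of Python. This is only an illustration of the idea described above, not Freenet's actual code; the function names and the example peer locations are made up for the sketch.

```python
from typing import Sequence


def circular_distance(a: float, b: float) -> float:
    """Distance between two locations on the [0, 1) circle."""
    diff = abs(a - b) % 1.0
    # Going the "other way" around the circle may be shorter.
    return min(diff, 1.0 - diff)


def closest_peer(peer_locations: Sequence[float], target: float) -> float:
    """Return the peer location nearest to the request's ideal location."""
    return min(peer_locations, key=lambda loc: circular_distance(loc, target))


# The two examples from the text: both pairs are 0.2 apart.
print(round(circular_distance(0.3, 0.5), 6))
print(round(circular_distance(0.1, 0.9), 6))

# Hypothetical peers: 0.05 is nearest to 0.95 because of the wrap-around.
print(closest_peer([0.05, 0.4, 0.8], 0.95))
```

A real node would repeat the `closest_peer` step at every hop, with each node looking only at its own peers.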

Depending on the network security level, a node can run in “opennet” mode, in which it connects with nodes run by strangers: untrusted people the node’s operator does not know. This is in contrast to the preferred mode of operation, called “darknet,” in which the node only connects to people the node operator knows in person, at least enough to be pretty sure they aren’t a secret agent or incredibly bad at securing their computer.

Okay, so this is all fine and good, but so what – what can it do? The most straightforward use is to insert a file and share the key with others so that they can retrieve it. The problem becomes how to tell other people. If all Freenet can do is act as a file storage device in which one can only retrieve files one already knows about, Freenet can’t do much.

While this is a limitation, many people have still built useful applications on top of Freenet. They tend to use a web of trust to discover files inserted by identities people create, and put those files together to present a responsive user experience locally. In effect, each application assembles from fetched files a local copy of something like the database that would usually live on a centralized server.

A bunch of plugins and external applications allow interactive communication over Freenet. There’s real-time chat, email, a cross between Twitter and a Facebook wall, and other applications which provide completely decentralized forum systems.

I collect and analyze data about Freenet to provide estimates of things like the size of the network and help better understand the network’s behaviour. Feel free to take a look! The links in the footer starting with “USK@” are links to things in Freenet and won’t work here.

URXVT

URXVT is a lightweight terminal emulator (with an equally excellent page on the Arch Wiki), but I didn’t like the default color set, especially when using WeeChat. Here is why:

After putting the scrollbar on the right, using a larger XFT font, and using the same color set as Gnome Terminal’s “Tango” theme, things look much nicer:

Here’s the .Xresources to do so:

URxvt.background: #300a24
URxvt.foreground: #FFFFFF

URxvt.font: xft:DejaVu Sans Mono:size=12
URxvt.iconFile: /usr/share/icons/Humanity/apps/48/terminal.svg
URxvt.scrollBar_right: true

! gnome-terminal Tango theme
! black
URxvt.color0 : #2E2E34343636
URxvt.color8 : #555557575353
! red
URxvt.color1 : #CCCC00000000
URxvt.color9 : #EFEF29292929
! green
URxvt.color2 : #4E4E9A9A0606
URxvt.color10 : #8A8AE2E23434
! yellow
URxvt.color3 : #C4C4A0A00000
URxvt.color11 : #FCFCE9E94F4F
! blue
URxvt.color4 : #34346565A4A4
URxvt.color12 : #72729F9FCFCF
! magenta
URxvt.color5 : #757550507B7B
URxvt.color13 : #ADAD7F7FA8A8
! cyan
URxvt.color6 : #060698209A9A
URxvt.color14 : #3434E2E2E2E2
! white
URxvt.color7 : #D3D3D7D7CFCF
URxvt.color15 : #EEEEEEEEECEC
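The colors above are written with four hex digits per channel, which is how Gnome Terminal stores its palette. To compare them with the more familiar six-digit form, something like the following Python sketch works. It assumes the high byte of each channel is the one to keep, and the function name is made up for this example.

```python
def to_8bit_hex(color: str) -> str:
    """Convert '#RRRRGGGGBBBB' to '#RRGGBB' by keeping each channel's high byte."""
    digits = color.strip().lstrip("#")
    if len(digits) != 12:
        raise ValueError("expected 12 hex digits, got %r" % color)
    int(digits, 16)  # raises ValueError if any character is not a hex digit
    channels = [digits[i:i + 4] for i in range(0, 12, 4)]
    return "#" + "".join(channel[:2] for channel in channels).upper()


print(to_8bit_hex("#2E2E34343636"))  # color0 from the listing above
print(to_8bit_hex("#060698209A9A"))  # color6 from the listing above
```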

Copy and paste with URXVT took a bit of getting used to. Without additional setup, it requires using the Xorg paste buffer: selecting text copies; middle (scroll wheel) click pastes.

More Freenet Stats

Freenet is decentralized, so while it’s (intended to be) a small-world network and thus short routes exist between any two nodes, it can be difficult to have a routing algorithm which can find those routes using only local information. As far as I understand it, Jon Kleinberg’s work (brief, paper) forms the basis of Freenet’s networking model. We don’t have local connections, as they’re expensive to form and in empirical testing actually detrimental to performance, but apparently the model still holds. A core finding from this work is that if connections are distributed so that more-distant connections are less likely, an implicit structure forms which allows forwarding a message to the closest peer at each hop to form a short path. As Freenet uses 1-dimensional locations, this distribution is based on a probability which is proportional to the inverse of the distance. The ideal distribution is logarithmic, but from what I’ve gathered, Freenet’s distribution isn’t too close to it. Making the actual match the ideal is difficult – the network size is a factor in the distribution, but it cannot (practically) be known. Techniques by Oskar Sandberg (paper) are intended to produce this distribution without such knowledge, but Freenet’s implementation seems to not behave as intended. I hope to help discover why, and how to fix it.
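To get a feel for what "probability proportional to the inverse of the distance" looks like, here is a rough Python sketch that draws link lengths from such a distribution. It is not how Freenet or the simulator builds links. The `min_length` cutoff is an assumption standing in for the network-size dependence (roughly the spacing between neighbouring nodes), and 0.5 is the largest possible distance on the location circle.

```python
import random


def sample_link_length(rng: random.Random,
                       min_length: float = 0.001,
                       max_length: float = 0.5) -> float:
    """Draw a link length d with probability density proportional to 1/d.

    Inverse transform sampling: for a 1/d density on [min_length, max_length]
    the CDF is log(d / min_length) / log(max_length / min_length), so a
    uniform u maps to min_length * (max_length / min_length) ** u.
    """
    u = rng.random()
    return min_length * (max_length / min_length) ** u


rng = random.Random(42)  # fixed seed so repeated runs give the same sample
lengths = [sample_link_length(rng) for _ in range(10000)]

# Most links come out short, with a minority of long ones.
short_fraction = sum(1 for d in lengths if d < 0.05) / len(lengths)
print("fraction of links shorter than 0.05:", short_fraction)
```

Plotting the cumulative distribution of `lengths` against a logarithmic x axis should give a roughly straight line, which is the "logarithmic" ideal mentioned above.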

Edit May 2, 2012: I replaced a rather dubious ideal distribution plot made with a quick Python script with a much better-looking one made with a real simulator. Here are both distributions on the same plot:

Edit August 1, 2012: Corrected Y axis label to refer to the percent of links, not nodes.

Adventures in Python

I’ve been spending most of my waking hours with Python over break, and I really like the language. Unlike the standard C++ library schoolwork is limited to, in Python I can generally find a library to make my task a great deal simpler. I find assumptions that I make about syntax while figuring things out tend to hold and work as I would expect, and it’s incredibly convenient to pop open an interactive shell to try out an idea before dropping it into a larger program. I actually like the whitespace-sensitivity of Python due to the rudimentary level of organization, style, and readability it provides. It seems like there’s much less boilerplate code and syntax compared with something like C++. That said, it can be odd to have it be an open question what type something is or what attributes it has and have that lead to problems. It can be frustrating to change code and not know if the types are right until that part runs. These are problems which would not be present in a statically typed language, but such a language will probably not be so flexible.

I’ve managed to get one project into a state in which I’m willing to show it the light of day: RelayBot. Not finding a working IRC bridge bot, I worked off an existing (but for me non-functional) implementation which heavily informed its design. I built my version by removing parts until it connected properly, then writing more functionality and removing still more until it did what I had in mind. I hope to use it to bridge a channel on FLIP and Irc2P.

The project (again in Python) which is not yet ready is a network probing and analysis application. It collects network topology information (optionally in a threaded fashion) and commits the results to a sqlite database for later analysis. It’s hoped that this will allow evanbd to replace a collection of Bash scripts which take an incredibly long time to run and are prone to breaking. The basic functionality is there, but it has many rough edges still. I’m partial to the peer distribution graph:

GNUPlot really does give lovely images. What I find interesting about this is how there are clear peaks – many nodes claim 12 or 36 peers, which seems very likely to be a function of the peer connection caps and bandwidth limits. There were some outliers, with one node claiming 92 peers! What’s encouraging is that this overall pattern seemed quite stable even as many more probes were collected.

This project has made clear to me how much I need to learn SQL properly. I initially wrote a collection of three queries to generate this: one query retrieved keys which were used to iterate over the other two. Generating this graph took about two hours. I figured out how to rewrite it as a single query that lets the database do the work, and the exact same graph generated in approximately 30 seconds! What’s more, there’s a query I’d like to write but don’t know how to: “Take the sum of the count of the distinct traceNums for each probeID.” It sounds so SQL-y I’m not sure quite how I haven’t been able to do so.
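One way that query might be written is with a grouped subquery. The sketch below runs it against a throwaway in-memory database; the table name `traces` and the sample rows are invented for illustration, and only the `probeID` and `traceNum` column names come from the description above.

```python
import sqlite3

# Hypothetical schema: one row per (probe, trace) observation.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE traces (probeID INTEGER, traceNum INTEGER)")
connection.executemany(
    "INSERT INTO traces (probeID, traceNum) VALUES (?, ?)",
    [(1, 1), (1, 1), (1, 2), (2, 1), (2, 3), (2, 3), (3, 5)],
)

# Inner query: count the distinct traceNums for each probeID.
# Outer query: add those per-probe counts together.
query = """
    SELECT SUM(distinct_traces)
    FROM (
        SELECT COUNT(DISTINCT traceNum) AS distinct_traces
        FROM traces
        GROUP BY probeID
    )
"""
total = connection.execute(query).fetchone()[0]
print(total)
connection.close()
```

With the sample rows, probes 1 and 2 each have two distinct trace numbers and probe 3 has one, so the total is 5.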

It’s been a fun break, and a shame it couldn’t last longer. Learning in this kind of an organic way with immediate results and self-demonstrating practicality is fantastic.

Hard Drives

It’s not a question of if, but when. Hard drive failure has been a large part of my life recently: this server, my desktop, my roommate’s laptop – and all I can easily do is keep an eye on smartctl. More realistically, I should likely configure smartd to do it for me, but that’s for another day. Resizing an encrypted partition is rather… manual. Apparently the way to extend a partition in fdisk is to delete it and recreate one with the same type and starting position but farther endpoint. Nerve-wracking. At least I have yet to lose data to hard drive failure. Part of it’s being careful – backups; checking drive health – and part of it’s luck. My roommate’s laptop hard drive died completely and without warning. Storage is fragile.

Fun with Linux

I removed my dad’s Linux installation; it was more than two years old and he wasn’t using it, so it just took up half his hard drive as that’s how we had partitioned it. Getting rid of GRUB was the first step, so I booted into the XP recovery console from the installation disc. I was prompted for an Administrator password, but it turned out to be blank, so I just hit enter. Woo, security. I didn’t run bootcfg /rescan or fixboot; fixmbr alone was enough to do it. It successfully booted with the Windows bootloader upon restarting, so I used the XP partition manager to remove the Linux partitions. I couldn’t seem to remove the extended partition, nor resize the volume, which made sense because the extended partition was there. I booted up into System Rescue CD, fdisk’d away the extended partition, then fell back to GParted to expand the single remaining partition and its filesystem to fill the drive. Presto: double the free space available to Windows!

I also decided to try to dual-boot Debian Squeeze and Debian Wheezy on my netbook. This is because in Physics 260 we use python-visual in our computer homework, and the version in Squeeze has a problem that results in simple renderings containing, for instance, nothing but a sphere and a box taking seconds per frame. The error message i915_program_error: Exceeded max instructions is also emitted. I used the Debian installer’s guided full-disk encryption to set up this machine, so I have an ext2 partition mounted as /boot, then logical volumes for /home, /, and swap within an encrypted LVM. I wasn’t sure if two installations of Debian sharing a /boot partition was a good idea, but I assumed it wasn’t and so halved the existing one and added another ext2 partition for the new installation. I wonder if ext3 is a more dependable choice. Then I had to make another logical volume in the encrypted volume group for use as / for the Debian Wheezy installation. After poking around online to get an idea of what to do, I booted into a LiveUSB and started with cryptsetup luksOpen to open the encrypted container, then ran vgscan to find volume groups and vgchange -a y to make the logical volumes available. LVM is an alternative to partitions, so I then shrank the filesystem on /home with resize2fs, then shrank the logical volume around it with lvreduce. It was a little scary when resize2fs and lvreduce appeared to treat units differently, but it seems to have been fine. If my understanding is correct, resize2fs reports size in 4 KiB blocks (which it prints as 4k), and lvreduce speaks of base 10 units, yet seems to mean base 2. lvcreate was the easy part.
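The unit mismatch is easier to reason about with the arithmetic written out. This Python sketch assumes, as described above, that resize2fs counts 4 KiB blocks and that a size such as "10G" given to lvreduce means 10 GiB (base 2). The function names and the 10 GiB figure are just for illustration.

```python
BLOCK_SIZE = 4 * 1024   # bytes per filesystem block reported by resize2fs (4k)
GIB = 1024 ** 3         # base 2 "G", as lvreduce appears to mean it
GB = 1000 ** 3          # base 10 "G", for comparison


def blocks_to_gib(block_count: int, block_size: int = BLOCK_SIZE) -> float:
    """Size in GiB of a filesystem that is block_count blocks long."""
    return block_count * block_size / GIB


def filesystem_fits(block_count: int, volume_gib: float,
                    block_size: int = BLOCK_SIZE) -> bool:
    """True if the filesystem is no larger than the logical volume."""
    return block_count * block_size <= volume_gib * GIB


# How many 4 KiB blocks fit in "10G" under each interpretation.
blocks_base2 = 10 * GIB // BLOCK_SIZE
blocks_base10 = 10 * GB // BLOCK_SIZE
print(blocks_base2, blocks_base10)

print(blocks_to_gib(blocks_base2))
print(filesystem_fits(blocks_base2, 10))
```

The safe order is the one used above: shrink the filesystem first, check that it fits, and only then reduce the logical volume around it.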

Amazingly, my system still booted after all this, so I installed Debian Wheezy. It took some fiddling to get partman to recognize the contents of my logical volumes. I had to trigger loading cryptsetup by going into the encryption setup (IIRC pressing finish; same for LVM), then run cryptsetup, vgscan, and vgchange as before, then go out of and back into partman. The bootloader failed to install, but I continued without it and ran update-grub once back in Squeeze. Although it detected Wheezy in the LVM, the entry it generated wouldn’t boot because it failed to prompt for the crypt container’s passphrase. I’m not sure why; that’s about as far as I’ve gotten. I tried without a separate boot partition and as would be expected it couldn’t even find the kernel. There are wishlist bugs filed in Debian about the installer’s support for encrypted LVM: #498199, #529343, and #566497 to name a few. I hope I can figure this out, but I don’t feel comfortable spending a great deal of time on it. I may just install to my flash drive.

Edit Jan 11, 2012:
I got it working! It wasn’t prompting for the passphrase because it was missing /etc/crypttab. Once I added that and chrooted in from the installer’s rescue mode to generate a new initramfs with update-initramfs -u, it worked! Hooray!

Umich AFS

The University of Michigan does not offer support for mounting ITS AFS space from non-university machines. I was able to set it up after a bit, and thought I’d document it here in the hopes that it’s helpful. For Ubuntu, the packages are openafs-client, krb5-user, and openafs-krb5. The realm name is UMICH.EDU – case matters. Next, as the official docs say, run kinit <username> (assuming your local username differs from your uniqname), which will prompt for your password, then run aklog. You should now be authenticated as yourself with your home directory at /afs/umich.edu/user/<username first letter>/<username second letter>/<username>/.
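The home directory layout is simple enough to compute. As a small Python sketch (the function name and the example uniqname "jdoe" are made up):

```python
def afs_home(username: str) -> str:
    """Build the AFS home directory path from a uniqname."""
    if len(username) < 2:
        raise ValueError("username needs at least two characters")
    return "/afs/umich.edu/user/{}/{}/{}/".format(
        username[0], username[1], username
    )


print(afs_home("jdoe"))
```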

Stuff

It seems clear in retrospect that my Windows installation died due to a failing hard drive; I RMA’d it after the self-test reported read failures. I hope to build a RAID array at some point, but it’s somewhat out of my price range currently to buy a couple of drives.

College costs too much – when I talk to people who went to college in the 70s, they tell me how they were able to pay for college with a part-time job. What happened?!

I’m trying to figure out if over the summer (hopefully as part of GSOC!) I could write a filesharing application for Freenet – one that would make finding and downloading stuff on Freenet about as easy as it is with Vuze. User interface would be incredibly important in this effort, as features are not sufficient for – and too often impede – intuitiveness. My target audience is people who aren’t computer-savvy. I’d want to use a lot of jQuery to make it more like a desktop application. When I made my chat plugin using Toadlets it was somewhat of a pain so I hope I can avoid having to do that somehow. I also need to learn more about and implement proper Model-View-Controller separation. Freenet itself doesn’t really have this. :\ Perhaps I could use WoT and allow people to publish a list of keys to associate with an identity. The tool should make it easy to assemble and update these lists, and people could share keys to the lists (likely USKs?) outside of the application-offered channel (e.g. Freemail) in order to have secret lists.

Collectd and School

I’ve discovered collectd, which is a pretty comprehensive logging daemon. I’m using the ping plugin to attempt to get some data on when (and hopefully why) the Internet connection here in the apartments starts being worse than usual. The graphs aren’t incredibly clear, and it’s not as simple as lots of packet loss anymore – even so, TF2/Counter-Strike can be unplayable, especially later at night. I’m pinging the gateway on the other side of the DSL connection, as well as a few servers around the US. I wonder if the spikes later on are what’s causing the problem, but I don’t know if that’s due to more network load from us or in general.

My classes are interesting, but I have enough work that I feel like I’m always behind. I like political science though – questions such as what the role of government is in free speech, or what free speech even is, fascinate me.

Freenet and Linux Confusion

I’m working on putting together Node-to-Node Darknet chat for Freenet. I was originally working on it as a mainline Freenet feature, but then I realized that it was too large a feature to be present outside of a plugin. The plugin interface seems, from what I’ve seen, disturbingly close to the internals of Freenet: much of the code I was writing when internal to Freenet is very easily translated to the plugin. I hope I’m either misjudging things or doing it wrong. For example, I need to have a listener for my chat messages, and to register it, I call the same method from the plugin as I would internally. To send a message, though I can no longer add code to DarknetPeerNode, I can still call the basic message-sending method. Perhaps these things aren’t likely to change, but I somehow expected a plugin interface based more on actions than the internal structure. I hope I can get it working anyway; this has been taking longer than I’d like. I’m trying to learn jQuery as well for a shiny interface.

I upgraded my graphics card driver (AMD Catalyst) in the hope that they’d fixed the bug introduced in 11.5 where the mouse cursor gets stuck and moves choppily in the lower right and upper left corners. They hadn’t, and in the process of installing the driver somehow my sound broke. It started working again this morning. It may have been that I re-added myself to pulse ~~and pulse-access~~ groups and needed to reboot for the changes to take effect, but if not I can’t think of why. EDIT: I think it’s working again now. The HDMI on my Radeon HD 5770 may be conflicting with the normal sound card now for some reason: it’s not listed in sound preferences and the sound is working; when it was the sound wasn’t. It’s worth noting that the “Use audio devices” privilege checkbox is apparently not supposed to be checked. This article is helpful as well. EDIT 2: This article ended up doing the trick to get surround sound output and front panel input working. What worked for me was appending options snd-hda-intel model=auto to /etc/modprobe.d/alsa-base.conf and running sudo alsa force-reload to apply the changes. This is running Ubuntu Natty 11.04.

I’m looking into self-hosted or open source alternatives to Dropbox. After looking at lipsync and sparkleshare, I tried dvcs-autosync as it seemed more lightweight and used XMPP (instead of IRC like sparkleshare) for communication. It didn’t seem to work – changes were committed one file at a time, which was annoying as it popped up a notification bubble for each one. Although one side would appear to log in to the XMPP server and send its status updates, the other side always ignored them. When using Jabber.org instead of a local server, it seemed like only one side could log on at a time. sparkleshare, as it uses git, is incapable of syncing git repos. I’m now trying lipsync, but it’s not working quite right yet. On the upside, it’s a small enough thing that it’s fairly easy to pick up.