More Freenet Stats

Freenet is decentralized, so while it’s (intended to be) a small-world network and thus short routes exist between any two nodes, it can be difficult to devise a routing algorithm which finds those routes using only local information. As far as I understand it, Jon Kleinberg’s work (brief, paper) forms the basis of Freenet’s networking model. We don’t have local connections, as they’re expensive to form and in empirical testing actually detrimental to performance, but apparently the model still holds. A core finding from this work is that if connections are distributed so that more-distant connections are less likely, an implicit structure forms in which forwarding a message at each hop to the peer closest to the destination yields a short path. As Freenet uses 1-dimensional locations, the probability of a connection is proportional to the inverse of its distance. The ideal distribution is logarithmic, but from what I’ve gathered, Freenet’s distribution isn’t very close to it. Making the actual match the ideal is difficult – the network size is a factor in the distribution, but it cannot (practically) be known. Techniques by Oskar Sandberg (paper) are intended to produce this distribution without such knowledge, but Freenet’s implementation doesn’t seem to behave as intended. I hope to help discover why, and how to fix it.
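To make the routing idea concrete, here is a minimal Python sketch of greedy routing on a circular 1-dimensional keyspace where links are chosen with probability proportional to the inverse of distance. It is only an illustration under my own assumptions, not Freenet's code or the simulator used for the plot: the function names, the network size, and the number of links per node are all made up, and it has no local connections, so a route can hit a dead end.

```python
import random


def circular_distance(a, b):
    """Distance between two locations on the circular keyspace [0, 1)."""
    d = abs(a - b)
    return min(d, 1.0 - d)


def build_network(n, links_per_node, rng):
    """Place n nodes at random locations and link them with probability ~ 1/distance."""
    locations = [rng.random() for _ in range(n)]
    peers = {i: set() for i in range(n)}
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Weight each candidate by the inverse of its distance (guarding against zero).
        weights = [1.0 / max(circular_distance(locations[i], locations[j]), 1e-12)
                   for j in others]
        # Sampling is with replacement, so a node may end up with fewer distinct peers.
        for j in rng.choices(others, weights=weights, k=links_per_node):
            peers[i].add(j)  # connections are bidirectional
            peers[j].add(i)
    return locations, peers


def greedy_route(locations, peers, start, target):
    """Return the hop count of a greedy route, or None if it hits a dead end."""
    current, hops = start, 0
    while current != target:
        here = circular_distance(locations[current], locations[target])
        best = min(peers[current],
                   key=lambda p: circular_distance(locations[p], locations[target]))
        if circular_distance(locations[best], locations[target]) >= here:
            return None  # no peer is closer to the target than we are
        current, hops = best, hops + 1
    return hops


if __name__ == "__main__":
    rng = random.Random(42)  # hypothetical seed and sizes, chosen only for the demo
    locations, peers = build_network(n=500, links_per_node=5, rng=rng)
    results = [greedy_route(locations, peers, rng.randrange(500), rng.randrange(500))
               for _ in range(200)]
    found = [h for h in results if h is not None]
    print(f"{len(found)}/200 routes found, mean hops {sum(found) / max(len(found), 1):.2f}")
```

Replacing the inverse-distance weights with uniform ones gives a baseline for comparison, which is roughly the kind of comparison the ideal-versus-actual plot is about.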

Edit May 2, 2012: I replaced a rather dubious ideal distribution plot made with a quick Python script with a much better-looking one made with a real simulator. Here are both distributions on the same plot:

Edit August 1, 2012: Corrected Y axis label to refer to the percent of links, not nodes.

Both plots on the same graph

Adventures in Python

I’ve been spending most of my waking hours with Python over break, and I really like the language. Unlike with the standard C++ library that schoolwork is limited to, in Python I can generally find a library to make my task a great deal simpler. I find that the assumptions I make about syntax while figuring things out tend to hold and work as I would expect, and it’s incredibly convenient to pop open an interactive shell to try out an idea before dropping it into a larger program. I actually like the whitespace-sensitivity of Python due to the rudimentary level of organization, style, and readability it provides. It seems like there’s much less boilerplate code and syntax compared with something like C++. That said, it can be odd for the type of something, or the attributes it has, to be an open question, and for that to lead to problems. It can be frustrating to change code and not know if the types are right until that part runs. These problems would not be present in a statically typed language, but such a language would probably not be so flexible.

I’ve managed to get one project into a state in which I’m willing to show it the light of day: RelayBot. Not finding a working IRC bridge bot, I worked off an existing (but for me non-functional) implementation which heavily informed its design. I built my version by removing parts until it connected properly, then writing more functionality and removing still more until it did what I had in mind. I hope to use it to bridge a channel on FLIP and Irc2P.

The project (again in Python) which is not yet ready is a network probing and analysis application. It collects network topology information (optionally in a threaded fashion) and commits the results to a sqlite database for later analysis. It’s hoped that this will allow evanbd to replace a collection of Bash scripts which take an incredibly long time to run and are prone to breaking. The basic functionality is there, but it has many rough edges still. I’m partial to the peer distribution graph:

Histogram of Number of Nodes vs Claimed Number of Peers

GNUPlot really does give lovely images. What I find interesting about this is how there are clear peaks – many nodes claim 12 or 36 peers, which seems very likely to be a function of the peer connection caps and bandwidth limits. There were some outliers, with one node claiming 92 peers! What’s encouraging is that this overall pattern seemed quite stable even as many more probes were collected.

This project has made clear to me how much I need to learn SQL properly. I initially wrote a collection of three queries to generate this: one query retrieved keys which were used to iterate over the other two. Generating this graph took about two hours. I figured out how to rewrite it to use the proper SQL commands for getting the result, and the exact same graph generated in approximately 30 seconds! What’s more, there’s a query I’d like to write but don’t know how to: “Take the sum of the count of the distinct traceNums for each probeID.” It sounds so SQL-y that I’m not sure quite how I haven’t been able to do so.
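One way that query might be written is with a grouped subquery: count the distinct traceNums per probeID, then sum those counts. The sketch below uses Python's sqlite3 module with an in-memory database. The table name traces and the sample rows are hypothetical; only the traceNum and probeID column names come from the description above.

```python
import sqlite3

# Inner query: one row per probeID with its number of distinct traceNums.
# Outer query: add those per-probe counts together.
QUERY = """
SELECT SUM(distinct_traces)
FROM (
    SELECT COUNT(DISTINCT traceNum) AS distinct_traces
    FROM traces
    GROUP BY probeID
)
"""


def sum_distinct_traces(conn):
    """Return the sum over probeIDs of the count of distinct traceNums."""
    (total,) = conn.execute(QUERY).fetchone()
    return total  # None if the table is empty


def demo_connection():
    """Build a small in-memory database with made-up rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE traces (probeID INTEGER, traceNum INTEGER)")
    rows = [(1, 1), (1, 1), (1, 2),   # probe 1: two distinct traceNums
            (2, 1), (2, 2), (2, 3)]   # probe 2: three distinct traceNums
    conn.executemany("INSERT INTO traces VALUES (?, ?)", rows)
    return conn


if __name__ == "__main__":
    print(sum_distinct_traces(demo_connection()))  # prints 5
```

Letting the database do the grouping in a single statement is the same change that took the graph from hours to seconds: the iteration happens inside SQLite rather than in a Python loop issuing one query per key.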

It’s been a fun break, and a shame it couldn’t last longer. Learning in this kind of an organic way with immediate results and self-demonstrating practicality is fantastic.

Hard Drives

It’s not a question of if, but when. Hard drive failure has been a large part of my life recently: this server, my desktop, my roommate’s laptop – and all I can easily do is keep an eye on smartctl. More realistically, I should likely configure smartd to do it for me, but that’s for another day. Resizing an encrypted partition is rather… manual. Apparently the way to extend a partition in fdisk is to delete it and recreate one with the same type and starting position but a farther endpoint. Nerve-wracking. At least I have yet to lose data to hard drive failure. Part of it’s being careful – backups; checking drive health – and part of it’s luck. My roommate’s laptop hard drive died completely and without warning. Storage is fragile.

Fun with Linux

I removed my dad’s Linux installation; it was more than two years old and he wasn’t using it, so it just took up half his hard drive, as that’s how we had partitioned it. Getting rid of GRUB was the first step, so I booted into the XP recovery console from the installation disc. I was prompted for an Administrator password, but it turned out to be blank, so I just hit enter. Woo, security. I didn’t run bootcfg /rescan or fixboot; fixmbr alone was enough to do it. It successfully booted with the Windows bootloader upon restarting, so I used the XP partition manager to remove the Linux partitions. I couldn’t seem to remove the extended partition, nor resize the volume, which made sense given that the extended partition was still there. I booted up into System Rescue CD, fdisk’d away the extended partition, then fell back to GParted to expand the single remaining partition and its filesystem to fill the drive. Presto: double the free space available to Windows!

I also decided to try to dual-boot Debian Squeeze and Debian Wheezy on my netbook. This is because in Physics 260 we use python-visual in our computer homework, and the version in Squeeze has a problem that results in simple renderings containing, for instance, nothing but a sphere and a box taking seconds per frame. The error message i915_program_error: Exceeded max instructions is also emitted. I used the Debian installer’s guided full-disk encryption to set up this machine, so I have an ext2 partition mounted as /boot, then logical volumes for /home, /, and swap within an encrypted LVM. I wasn’t sure if two installations of Debian sharing a /boot partition was a good idea, but I assumed it wasn’t and so halved the existing one and added another ext2 partition for the new installation. I wonder if ext3 is a more dependable choice. Then I had to make another logical volume in the encrypted volume group for use as / for the Debian Wheezy installation. After poking around online to get an idea of what to do, I booted into a LiveUSB and started with cryptsetup luksOpen to open the encrypted container, then vgscan to find volume groups, and vgchange -a y to make the logical volumes available. Logical volumes take the place of partitions here, so I then shrank the filesystem on my /home with resize2fs, then shrank the logical volume around it with lvreduce. It was a little scary when resize2fs and lvreduce appeared to treat units differently, but it seems to have been fine. If my understanding is correct, resize2fs reports size in 4 KiB blocks (which it prints as 4k), and lvreduce speaks of base 10 units, yet seems to mean base 2. lvcreate was the easy part.
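Since the unit mismatch was the scary part, here is a small Python sketch of the arithmetic for checking that a shrunken filesystem still fits inside the shrunken logical volume. It assumes, as I concluded above, that resize2fs counts 4 KiB blocks and that lvreduce sizes are really base 2. The function names and the block count are made up for illustration and are not the numbers from my disk.

```python
FS_BLOCK_SIZE = 4 * 1024  # bytes per filesystem block (4 KiB), as assumed above

BINARY = {"K": 2**10, "M": 2**20, "G": 2**30}   # KiB, MiB, GiB
DECIMAL = {"K": 10**3, "M": 10**6, "G": 10**9}  # kB, MB, GB


def blocks_to_bytes(blocks, block_size=FS_BLOCK_SIZE):
    """Convert a filesystem block count into bytes."""
    return blocks * block_size


def volume_bytes(size, unit, binary=True):
    """Convert a volume size such as (10, "G") into bytes under either interpretation."""
    table = BINARY if binary else DECIMAL
    return size * table[unit]


def fits(fs_blocks, lv_size, lv_unit, binary=True):
    """True if the filesystem is no larger than the logical volume around it."""
    return blocks_to_bytes(fs_blocks) <= volume_bytes(lv_size, lv_unit, binary)


if __name__ == "__main__":
    blocks = 2621440  # hypothetical filesystem size: exactly 10 GiB in 4 KiB blocks
    print(blocks_to_bytes(blocks))              # prints 10737418240
    print(fits(blocks, 10, "G", binary=True))   # prints True
    print(fits(blocks, 10, "G", binary=False))  # prints False
```

The last two lines show why the interpretation matters: a filesystem of exactly 10 GiB fits a 10 GiB volume but would be cut off by a volume of 10 decimal gigabytes. Shrinking the filesystem a little further than the target and growing it back afterwards avoids depending on getting this exactly right.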

Amazingly, my system still booted after all this, so I installed Debian Wheezy. It took some fiddling to get partman to recognize the contents of my logical volumes. I had to trigger loading cryptsetup by going into the encryption setup (IIRC pressing finish; same for LVM), then use cryptsetup, vgscan, and vgchange as before, then go out of and back into partman. The bootloader failed to install, but I continued without it and ran update-grub once back in Squeeze. Although it detected Wheezy in the LVM, the entry it generated wouldn’t boot because it failed to prompt for the crypt container’s passphrase. I’m not sure why; that’s about as far as I’ve gotten. I tried without a separate boot partition and, as would be expected, it couldn’t even find the kernel. There are wishlist bugs filed in Debian about the installer’s support for encrypted LVM: #498199, #529343, and #566497 to name a few. I hope I can figure this out, but I don’t feel comfortable spending a great deal of time on it. I may just install to my flash drive.

Edit Jan 11, 2012:
I got it working! It wasn’t prompting for the passphrase because it was missing /etc/crypttab. Once I added that and chrooted in from the installer’s rescue mode to generate a new initramfs with update-initramfs -u, it worked! Hooray!

Umich AFS

The University of Michigan does not offer support for mounting ITS AFS space from non-university machines. I was able to set it up after a bit, and thought I’d document it here in the hopes that it’s helpful. For Ubuntu, the packages are openafs-client, krb5-user, and openafs-krb5. The realm name is UMICH.EDU – case matters. Next, as the official docs say, run kinit <username> (assuming your local username differs from your uniqname), which will prompt for your password, followed by aklog. You should now be authenticated as yourself with your home directory at /afs/umich.edu/user/<username first letter>/<username second letter>/<username>/.
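The home directory layout is easy to get wrong by hand, so here is a tiny Python helper that builds the path from a uniqname following the pattern above. The function name and the example uniqname are hypothetical.

```python
def afs_home(uniqname, cell="umich.edu"):
    """Build the AFS home directory path: /afs/<cell>/user/<1st letter>/<2nd letter>/<uniqname>/."""
    if len(uniqname) < 2:
        raise ValueError("uniqname must have at least two characters")
    return f"/afs/{cell}/user/{uniqname[0]}/{uniqname[1]}/{uniqname}/"


if __name__ == "__main__":
    print(afs_home("jdoe"))  # prints /afs/umich.edu/user/j/d/jdoe/
```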

Stuff

It seems clear in retrospect that my Windows installation died due to a failing hard drive; I RMA’d it after the self-test reported read failures. I hope to build a RAID array at some point, but it’s somewhat out of my price range currently to buy a couple of drives.

College costs too much – when I talk to people who went to college in the 70s, they tell me how they were able to pay for college with a part-time job. What happened?!

I’m trying to figure out if over the summer (hopefully as part of GSOC!) I could write a filesharing application for Freenet – one that would make Freenet about as easy as Vuze to find and download stuff. User interface would be incredibly important in this effort, as features are not sufficient for – and too often impede – intuitiveness. My target audience is people who aren’t computer-savvy. I’d want to use a lot of jQuery to make it more like a desktop application. When I made my chat plugin using Toadlets it was somewhat of a pain, so I hope I can avoid having to do that somehow. I also need to learn more about and implement proper Model-View-Controller separation. Freenet itself doesn’t really have this. :\ Perhaps I could use WoT and allow people to publish a list of keys to associate with an identity. The tool should make it easy to assemble and update these lists, and people could share keys to the lists (likely USKs?) outside of the application-offered channel (e.g. Freemail) in order to have secret lists.

Collectd and School

I’ve discovered collectd, which is a pretty comprehensive logging daemon. I’m using the ping plugin to attempt to get some data on when (and hopefully why) the Internet connection here in the apartments starts being worse than usual. The graphs aren’t incredibly clear, and it’s not as simple as lots of packet loss anymore – even so, TF2/Counter-Strike can be unplayable, especially later at night. I’m pinging the gateway on the other side of the DSL connection, as well as a few servers around the US. I wonder if the spikes later on are what’s causing the problem, but I don’t know if that’s due to more network load from us or in general.

My classes are interesting, but I have enough work that I feel like I’m always behind. I like political science though – questions such as what the role of government is in free speech, or what free speech is fascinate me.

Freenet and Linux Confusion

I’m working on putting together Node-to-Node Darknet chat for Freenet. I was originally working on it as a mainline Freenet feature, but then I realized that it was too large of a feature to be present outside of a plugin. The plugin interface seems, from what I’ve seen, disturbingly close to the internals of Freenet: much of the code I was writing when internal to Freenet is very easily translated to the plugin. I hope I’m either misjudging things or doing it wrong. For example, I need to have a listener for my chat messages, and to register it, I call the same method from the plugin as I would internally. To send a message, though I can no longer add code to DarknetPeerNode, I can still call the basic message-sending method. Perhaps these things aren’t likely to change, but I somehow expected a plugin interface based more on actions than the internal structure. I hope I can get it working anyway; this has been taking longer than I’d like. I’m trying to learn jQuery as well for a shiny interface.

I upgraded my graphics card driver (AMD Catalyst) in the hope that they’d fixed the bug introduced in 11.5 where the mouse cursor gets stuck and moves choppily in the lower right and upper left corners. They hadn’t, and in the process of installing the driver somehow my sound broke. It started working again this morning. It may have been that I re-added myself to the pulse and pulse-access groups and needed to reboot for the changes to take effect, but if not I can’t think of why. EDIT: I think it’s working again now. The HDMI on my Radeon HD 5770 may be conflicting with the normal sound card now for some reason: it’s not listed in sound preferences and the sound is working; when it was listed, the sound wasn’t. It’s worth noting that the “Use audio devices” privilege checkbox is apparently not supposed to be checked. This article is helpful as well. EDIT 2: This article ended up doing the trick to get surround sound output and front panel input working. What worked for me was appending options snd-hda-intel model=auto to /etc/modprobe.d/alsa-base.conf and running sudo alsa force-reload to apply the changes. This is running Ubuntu Natty 11.04.

I’m looking into self-hosted or open source alternatives to Dropbox. After looking at lipsync and sparkleshare, I tried dvcs-autosync as it seemed more lightweight and used XMPP (instead of IRC like sparkleshare) for communication. It didn’t seem to work – changes were committed one file at a time, which was annoying as it popped up a notification bubble for each one. Although one side would appear to log in to the XMPP server and send its status updates, the other side always ignored them. When using Jabber.org instead of a local server, it seemed like only one side could log on at a time. sparkleshare, as it uses git, is incapable of syncing git repos. I’m now trying lipsync, but it’s not working quite right yet. On the upside, it’s a small enough thing that it’s fairly easy to pick up.

Windows Reinstall, IDEA vs Eclipse, Ubuntu Natty

Soon after installing SP 1, it looked like my Windows 7 installation was hanging on boot. A few impatient hard reboots later I had a consistent BSOD. (0x0000007B (0xFFFFF880009A9928, 0xFFFFFFFFC0000034, 0x0000000000000000, 0x0000000000000000)) Startup repair didn’t help and neither did any utilities I could think of, whether available from startup repair or not. Safe mode wouldn’t boot and would hang on classpnp.sys. I tried fixmbr, and then realized that restoring GRUB when running with an encrypted root partition is more complicated. I should have just backed up the bootloader to an image, but I didn’t. Luckily, someone more knowledgeable than I had the same problem (though not for as foolish a reason) and I found their guide. sfc /scannow didn’t fix it. Some forums suggested deleting classpnp.sys, which didn’t help; it just started rebooting without displaying a message. Copying the classpnp.sys from a machine also running Home Premium 64-bit just restored it to the BSOD. I ended up reinstalling Windows 7. I moved my files to my Linux backup partition and back while booted into Linux; in retrospect, given how much fragmentation that caused, I should have installed over it and just copied from the Windows.old directory on the new installation. I assume that the Linux NTFS drivers are to blame for the high levels of fragmentation, and that Windows would handle it far better.

While Eclipse works, it is rather sluggish and the UI is clunky. Thanks to ##java on Freenode, I’m now using IDEA. It’s somewhat different, and it doesn’t have the ANT error-checking that Eclipse has, but I prefer it. The interface is less cluttered and it feels easier to devote more of my screen to the actual source code. I think the reason is as simple as the buttons to hide windows being larger, acting as toggles, and not moving. The ANT problem I ran into took a while to figure out. IDEA was nice enough to offer to pull code down from GitHub, but although the resulting code compiled, the .jars didn’t run correctly. Reduced to diffing the directories, I discovered that IDEA didn’t catch the ANT problems that Eclipse did, and upon fixing those it worked.

I’ve upgraded to Ubuntu Natty, and I don’t like the new interface. I don’t think it works very well. For instance, it’s very easy to remove an item from the dock, but dragging an item to change the order takes too long. This is probably due to the necessity of differentiating between scrolling the contents and dragging a single item. It’s not intuitive. Similarly, adding an icon to the bar is as easy as dragging, but it only stays in the bar if the thing that was dragged there remains. I’m glad GDM presents the Ubuntu Classic option. My desktop’s upgrade to Natty was very much unlike those of the three other machines I’ve upgraded uneventfully. Not only did GRUB install unsuccessfully, which resulted in symbol not found : 'grub_env_export', (following the above guide again fixed it) but something about my motherboard led to a very long hang on boot after NET: Registered protocol family 1. At least there are bugs filed for the former and the latter, but I’m glad I wasn’t also hit by this one. Perhaps this will lead to me reading the release notes before an upgrade. However, I don’t see mention of these issues in the release notes, which makes me worry that reading them won’t be enough.

Problem Types

I find that among my least favorite types of problems are those that I’m unable to learn from. My main system drive was spontaneously remounted read-only, and upon Alt-Sysreq-reisub’ing, the OS didn’t come up and I got “error: partition not found” and a grub rescue> prompt that couldn’t do anything; not even “help.” I pushed in all the SATA cables and it came up, but upon reboot the drives were out of correct boot order. Bizarre. The part about this that scares me is that I’m for the most part unable to learn anything from it, and I can’t do anything to stop it from happening again because I don’t know why it happened. The same thing applies to the mysterious times this machine goes completely unresponsive while idle or suddenly doesn’t have video on boot, then spontaneously regains it. The former has happened a few times, the latter only once.

From programming in assembly, I finally realize how segmentation faults are really nice compared to the alternative. Data and instruction separation is a luxury. Miss a bounds check and suddenly you’re executing things not intended to be instructions and you get really weird opcodes and the whole thing dies. It can get really frustrating.

I realized the only reasons my server has gone down at dad’s are external forces: either the power has gone out, whether from an outage or a tripped circuit breaker, or cables have been unplugged by unwitting family members. I wonder how much better colos are. ChunkHost was really nice, and I’d have likely continued once my “free beta” ended (I have my suspicions it’s marketing-speak for “free trial”) if I had enough disposable income that I felt I could justify a monthly fee.

EDIT: I had forgotten the time it went down as I was upgrading from Debian Lenny to Squeeze. I had set up a virtual machine for fallback, but I didn’t end up using it: in restoring the VM from backup I unknowingly uncovered a configuration problem with one of the hosted sites that showed up a few days later on the main server. Whoops.