After Ars Technica ran this article on a Stable Diffusion mobile app, it seemed like a good time to give Stable Diffusion another shot. I had previously given up on figuring out how to set up the desktop version. The mobile app is polished! It includes example prompts to demonstrate what sorts of incantations make a good prompt, and with those and a moderate wait for it to generate, I had this:
That was enough to hook me, so when I noticed that article also linked to stable-diffusion-webui, it became a great time to see what the same underlying image generation can do when it can draw ~300 W continuously on my desktop instead of being limited to a phone’s resources. I was quickly (and somewhat inadvertently) able to generate a cat fractal:
This was my introduction to the sorts of artifacts I could expect to do battle with. Then I had an idea of how to use its ability to modify existing photos. After some finagling, I had a prompt ready, set it to replace the view outside the window, and left it running. When set to a very high level of detail and output resolution, it generated 92 images over about 6 hours. Of those, 22 seemed pretty good. Here is a comparison of my 2 favorites:
And a slightly different prompt where I had selected part of the window other than the glass:
There were three sorts of artifacts or undesirable outputs that were frequent:
It focused too much on the “circular” part describing a hobbit hole door.
It added unsettling Hobbit-cryptids that brought to mind Loab.
The way I used the built-in inpainting to replace the outside meant both that the image was awkwardly separated into three largely independent areas and that the generated scenery often tried to merge with the area around the window. This makes total sense for its actual use case of editing an image, but I tried without success to configure it to ignore the existing image. In retrospect, I could have used the mask I made to manually put the images behind the window with conventional editing software.
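That manual compositing step could even be scripted. Here’s a minimal sketch using Pillow, with solid-color stand-ins for the room photo, the generated scenery, and the window mask (the real versions would be loaded from disk):

```python
from PIL import Image

# Stand-ins for the real images (which would be loaded from disk):
photo = Image.new("RGBA", (64, 64), "gray")     # the room photo
scenery = Image.new("RGBA", (64, 64), "green")  # a generated landscape
mask = Image.new("L", (64, 64), 0)              # black = keep the photo
mask.paste(255, (16, 16, 48, 48))               # white = the window area

# Where the mask is white, show the scenery; elsewhere keep the photo.
result = Image.composite(scenery, photo, mask)
print(result.getpixel((32, 32)))  # -> (0, 128, 0, 255), the scenery
print(result.getpixel((0, 0)))    # -> (128, 128, 128, 255), the photo
```

`Image.composite` takes the first image wherever the mask is white and the second everywhere else, which is exactly the “put the generated image behind the window” operation.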
It’s wild to generate an image like this when I can’t even imagine painting it myself. It’s definitely not the system doing all the work – you have to come up with the right kind of prompt, adjust generation parameters, and curate the results. But it’s a lot cheaper and easier than art school and practice, which I feel uncomfortable about, because this model was trained, in part and without permission, on art from people who did go to art school.
I had figured it wouldn’t happen once Ghost Ship Games said they weren’t going to tackle it, but it turns out it’s viable to mod in! It’s in open beta currently, and there are rough edges throughout the experience, but the core of it – being in a huge cave and fighting bugs – is just as incredible as I had hoped it might be. It lets VR do what it’s good at – intensify already-good experiences. A far cry from the stereoscopic screenshots NVIDIA’s Ansel provided! I’m very pleased.
Here’s the problem – my neighbor’s HVAC closet is doing this, and it’s preventing me from sleeping:
I’ve submitted a maintenance request, sure, but more data more better, right? There are three states here: not buzzing, buzzing (a 55 Hz primary plus a 220 Hz higher component), and buzzing without the higher component. FFTs away!
I was hoping this would be the part where I link to the code, but boy howdy is it harder than I anticipated to get a microcontroller to do this.
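On a desktop, at least, the analysis itself is straightforward. Here’s a minimal sketch in Python with NumPy, using a synthesized stand-in for the buzz (a real version would feed in microphone samples instead; the sample rate and amplitudes are arbitrary choices):

```python
import numpy as np

rate = 8000                 # samples per second (arbitrary choice)
t = np.arange(rate) / rate  # one second of samples

# Synthesized stand-in for the buzz: a 55 Hz fundamental plus a
# weaker 220 Hz component. A real version would use mic samples.
signal = np.sin(2 * np.pi * 55 * t) + 0.3 * np.sin(2 * np.pi * 220 * t)

# Magnitude spectrum via a real-input FFT.
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), d=1 / rate)

# The two strongest bins land on the two components.
top = sorted(freqs[np.argsort(spectrum)[-2:]])
print(top)  # -> [55.0, 220.0]
```

Classifying the three states is then just a matter of checking how much energy sits in the 55 Hz and 220 Hz bins relative to the noise floor.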
Not for computers – for other electronics. I’d glossed over the “it can supply 600 mA peak” part for my microcontroller board’s 3.3 V regulator and assumed I wouldn’t hit that limit. Then I spent a great deal of time trying to diagnose strange nondeterministic behavior: the OLED display would quickly go blank, and the SCD40 CO2 sensor would quickly stop providing data. These problems all immediately stopped when I used an external power supply capable of higher amperage. Go figure.
It turns out that it’s pretty dang gratifying to build embedded systems. Don’t like the power LED that’s always on? It’s your code that turns it on! You can turn it off!
I have three projects that are functional so far. One logs temperature to a MicroSD card, which helped convince the leasing office that my fridge was not cooling well enough to be food safe. Another displays CO2 sensor readings. The third sends door close/open events over MQTT.
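For the door sensor, the MQTT side can stay very small. Here’s a sketch of the shape of it – the topic name and JSON payload are my inventions for illustration, and the publishing helper assumes an already-connected paho-mqtt client:

```python
import json
import time

def door_event(state, ts=None):
    """Format a door open/closed event as a JSON payload."""
    if state not in ("open", "closed"):
        raise ValueError(f"unexpected state: {state}")
    return json.dumps({"state": state, "ts": time.time() if ts is None else ts})

def publish_event(client, payload):
    """Send an event. `client` is assumed to be a connected
    paho-mqtt Client; the topic name here is made up."""
    client.publish("sensors/front-door", payload)

print(door_event("open", ts=0))  # -> {"state": "open", "ts": 0}
```

Anything subscribed to the topic (a logger, a dashboard, an automation) can then react to the events without the sensor knowing about any of them – which is most of the appeal of MQTT for projects like these.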
Currently I’m working on giving a kitchen scale the MQTT treatment so I can make detailed graphs of cat food consumption over time instead of manually weighing for two inconsistently timed data points per day. Next will probably be designing and printing a case for the CO2 thing, which is currently just a bunch of components taped to a power bank and is very flimsy.
I moved my computer, and with a new setup comes new hazards. My headphone cable was wrapped around my foot when I tried to get up, and it ripped the headphones out of the jack. Both the jack and the plug were damaged: the jack was no longer in one piece, with bits of plastic flown off its housing, while the plug was bent but intact. We were able to bend the plug back into shape with a vice, but the damaged jack would only provide audio to the left ear. Enter jack retasking in the Realtek Audio Console:
I was able to use the remaining front panel jack, and made sure to run the headphone cable with less slack to try to prevent a repeat performance. There’s a bug here, too: even though I’ve requested separate playback devices for the front and rear panel, it only gives me two if the normally-output jack is set to output. Close enough, and if I really wanted to work around it I could probably rewire the remaining jack to appear as the output one by moving cables in the front panel connector.
For the 8th grade talent show at the end of the year, I made a Flash animation to Lemon Demon’s Dance Like An Idiot. As you may be aware, Flash hasn’t aged super well, so I tried making a video of it at modern resolutions. It now has a height of 1080 pixels, compared with its original mid-2000s 548×400, but is still 12 FPS. I noticed some problems such as duplicated frames, but deemed it good enough for a first published attempt.
This video rendition is over 25x the size of the original SWF at around 33 MiB.