Monday, March 31, 2014

Homebrew USB Sound Card

This project began out of necessity as I was working on a Raspberry Pi project that required low latency audio output. However, I was having major issues with latency from the Pi's sound card, observing at least 20-30ms of latency. This would not do; I needed real time responsiveness, less than 5ms. With the help of CNLohr I began to try to build my own low latency USB sound card.

Here is the result...

CNLohr designed the PCB and I programmed the firmware. The sound card uses a MAX5556 digital to analog converter (DAC), an ATmega8U2, and an ATtiny44. I made a few modifications to the PCB to make processing audio data more efficient and to fix a design flaw.

An ATtiny44 is used to drive the MAX5556 DAC over its 3-wire I²S interface. The ATtiny runs at 24.5MHz which, with some clever use of the microprocessor, can feed the DAC with 48kHz 16-bit stereo data. I use the tiny's hardware timer to toggle the PA7 pin every 256 clock ticks, which drives the DAC's left/right clock. An assembly loop toggles the DAC's serial clock and sdata pins; this loop is synchronized with the left/right clock (PA7). An assembly interrupt copies data out of the ATtiny's SPI registers into memory. To keep the SPI interrupt as short as possible, I minimized the amount of work done inside it. Within the assembly, I carefully selected registers so that I could avoid having to push or pop registers when an SPI interrupt occurs.
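A quick sanity check of the clock math above (a sketch using the numbers from this post, not code from the firmware): toggling PA7 every 256 ticks means one full left/right-clock period is 512 ticks of the 24.5MHz clock.

```c
/* Clock-math sketch only -- numbers come from the post, not the firmware.
   PA7 toggles every 256 timer ticks, and a full left/right-clock period
   is two toggles, i.e. 512 ticks. */
double lrclk_hz(double f_cpu, double ticks_per_toggle) {
    return f_cpu / (2.0 * ticks_per_toggle);
}
```

With f_cpu = 24.5MHz and 256 ticks per toggle this gives roughly 47.85kHz, slightly under the nominal 48kHz sample rate.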

The ATmega8U2 is used to communicate with the USB host and the ATtiny. The ATmega8U2 is clocked at 16MHz, a limitation imposed by USB. Audio is streamed into a circular buffer from the USB host. The circular buffer is emptied over the SPI interface to the ATtiny microprocessor. Assembly is used to send the data over SPI to make use of the AVR's store-and-increment instruction, saving many clock cycles. The ATmega8U2 and the ATtiny are kept in sync using an interrupt on the PC7 pin. This pin is, like the DAC's left/right clock, connected to the tiny's PA7 pin. A rising edge on PC7 signals that it is time to begin sending new data to the ATtiny.
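The stream-and-drain scheme can be sketched as a simple ring buffer. This is an illustrative host-side sketch in plain C, not the actual firmware (that is in the linked repository); the size and names are my assumptions. A power-of-two size lets the index wrap with a cheap bitwise AND, which matters when the code runs inside an interrupt.

```c
#include <stdint.h>

/* Illustrative ring buffer; size and names are hypothetical.
   A power-of-two size makes the wrap a single AND instruction. */
#define BUF_SIZE 256u                /* bytes; must be a power of two */
#define BUF_MASK (BUF_SIZE - 1u)

static uint8_t  buf[BUF_SIZE];
static volatile uint8_t head, tail;  /* head: USB side writes, tail: SPI side reads */

/* Called as audio arrives from the USB host. Returns 0 when full. */
int buf_put(uint8_t b) {
    uint8_t next = (head + 1u) & BUF_MASK;
    if (next == tail) return 0;      /* full: would overrun the reader */
    buf[head] = b;
    head = next;
    return 1;
}

/* Called when draining to the ATtiny. Returns 0 on underflow. */
int buf_get(uint8_t *b) {
    if (tail == head) return 0;      /* empty: underflow condition */
    *b = buf[tail];
    tail = (tail + 1u) & BUF_MASK;
    return 1;
}
```

On the real part the put side would run from the USB endpoint handling and the get side from the PC7 rising-edge interrupt.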

A quickly hacked together host-side program sends raw stereo data to the sound card. The internal buffer of the sound card is about 64 samples per channel, about 1.3ms. USB double buffering provides an additional ~0.6ms of buffering. The sound card works pretty well from a PC. Occasionally it experiences an inaudible buffer underflow; a red LED flickers on the sound card when this is detected. Using this sound card on the Raspberry Pi is a completely different story. The Pi isn't capable of servicing the USB bus fast enough to keep up. Buffer underflow occurs constantly, making it completely useless.

The source code is available at https://github.com/axlecrusher/AvrProjects/tree/master/soundCard

Slightly outdated schematics can be found at https://svn.cnlohr.net/pubsvn/electrical/avr_soundcard/ They lack the modifications visible in my photos.

Edit: Added note about double-buffered USB.

Saturday, March 1, 2014

Mdadm External Write-intent Bitmap Hack (Debian update)

I recently upgraded to Debian 7.4 and found that I needed to redo my external write intent bitmap hack. My old methods no longer worked.

First I needed to disable mdadm assembly when running from the ramdisk. Edit /etc/default/mdadm and set the following:
INITRDSTART='none'

Rebuild the ramdisk image with:
update-initramfs -u

You still need to prevent mdadm from assembling the arrays listed in /etc/mdadm/mdadm.conf, so change its DEVICE line to:
DEVICE /dev/null

I created a new mdadm config file named /etc/mdadm/mdadm.delayed.conf, which is a copy of /etc/mdadm/mdadm.conf but with the line
DEVICE partitions
left as is. I also specify the bitmap file on the ARRAY definition line:
ARRAY /dev/md/127 metadata=0.90 bitmap=/md127-raid5-bitmap UUID=....

Next I created a new script /etc/init.d/delayedRaid
#!/bin/sh
#
# Start all arrays specified in the delayed configuration file.
#
# Copyright © 2014 Joshua Allen
# Distributable under the terms of the GNU GPL version 2.
#
### BEGIN INIT INFO
# Provides:          delayedRaid
# Required-Start:    $local_fs mdadm-raid
# Should-Start:
# X-Start-Before:
# Required-Stop:
# Should-Stop:       $local_fs mdadm-raid
# X-Stop-After:
# Default-Start:     S
# Default-Stop:      0 6
# Short-Description: Delayed MD array assembly
# Description:       This script delays assembly of MD raid
#                    devices. Useful for raid devices that use external
#                    write intent bitmaps.
#                    Settings are in /etc/mdadm/mdadm.delayed.conf
### END INIT INFO

. /lib/lsb/init-functions

do_start() {
    log_action_begin_msg "Starting delayed raid"
    mdadm --assemble --scan --config=/etc/mdadm/mdadm.delayed.conf
    log_action_end_msg $?
    mount /mnt/raid5
}

do_stop() {
    umount /mnt/raid5
    mdadm --stop --scan --config=/etc/mdadm/mdadm.delayed.conf
}

case "$1" in
    start)
        do_start
        ;;
    restart|reload|force-reload)
        echo "Error: argument '$1' not supported" >&2
        exit 3
        ;;
    stop)
        do_stop
        ;;
    *)
        echo "Usage: delayedRaid [start|stop]" >&2
        exit 3
        ;;
esac

And I added it to the start-up procedure with
insserv -d delayedRaid

After rebooting, check that
Intent Bitmap : {some file name}
is present when running
mdadm --detail /your/raid/device

Hopefully I didn't miss anything.

Friday, January 25, 2013

Install djbdns on Raspberry Pi


djbdns is a small, fast, and secure DNS server. Perfect for low resource systems. I also find it easier to configure than BIND (once you understand how).
I start with a raspbian image from http://www.raspberrypi.org/downloads

Install some packages that D. J. Bernstein says we need.
apt-get install ucspi-tcp
apt-get install daemontools

Don't install tinydns. It includes a pop3 server.
Install djbdns following http://cr.yp.to/djbdns/install.html

Create some users and groups that we will need for executing dnscache and multilog.
useradd svclog
useradd dnscache

Create the /etc/dnscache folder structure
dnscache-conf dnscache svclog /etc/dnscache

Set up the /service directory; svscan looks at this directory to see which services to run.
mkdir /service
ln -s /etc/dnscache /service/dnscache

Add the following to /etc/rc.local so that the supervised services start on boot.
/usr/bin/svscanboot &

svscanboot also needs the following link to function correctly.
ln -s /service/ /etc/service

Optional Things

Update /etc/dnscache/env/IP to contain the IP address to listen on. Also create file entries in /etc/dnscache/root/ip to specify the networks that the DNS server should reply to.

Edit /etc/dnscache/log/run adding s52428800 before ./main to set the log size to 50MB.
It should look something like
exec setuidgid svclog multilog t s52428800 ./main

You should update the root server list
wget http://www.internic.net/zones/named.root -O - | grep ' A ' | tr -s ' ' | cut -d ' ' -f4 > /etc/dnscache/root/servers/\@

Update /etc/resolv.conf to use your new dns server.

Change the UDP packet size to accommodate big UDP packets. Many DNS servers send large UDP packets, and without this change dnscache will fail with "drop" input/output errors. https://dev.openwrt.org/browser/packages/net/djbdns/patches/060-dnscache-big-udp-packets.patch

Resources

http://cr.yp.to/djbdns/dnscache.html
http://cr.yp.to/daemontools/multilog.html
http://cr.yp.to/daemontools/supervise.html
http://tinydns.org/

Saturday, September 22, 2012

mdadm external bitmap boot hack

Update 3/1/2014: I have updated my procedure for Debian 7.4.

Linux's mdadm gives you the ability to use a write intent bitmap, which helps speed resync if a drive fails and is then re-added. There are two bitmap storage options, internal and external. The internal storage stores the bitmap on the raid array disks and can slow writing significantly. External storage uses a file on an ext2 or ext3 file system. It can be quicker than internal storage during writing but causes big problems during boot.

In order for the bitmap to function, it must be writable at the time that mdadm assembles the array. If it is not writable, mdadm will fail to start the array. At the time that mdadm assembles the array, typically no partitions are writable. My solution was to shift array assembly to some point after mountall (which mounts the file systems found in fstab) has executed.

First I prevented any raid partitions from mounting during boot by adding noauto to their fstab entries.

I copied /etc/mdadm/mdadm.conf to /etc/mdadm/mdadm.manual.conf. Then I edited /etc/mdadm/mdadm.conf, changing DEVICE partitions to DEVICE /dev/null. This makes mdadm scan for raid partitions within the null device, where it will find none. Raid devices will no longer assemble during the boot process.

Then I created a new script in /etc/rcS.d/ named S02mountRaid (don't forget execute permissions). The script contains the following lines.
mdadm --assemble --scan --config=/etc/mdadm/mdadm.manual.conf
mount /mnt/raid5
This will cause mdadm to scan the copy of our mdadm.conf which we did not modify. Mdadm will correctly assemble the raid devices found in that file along with assigning the external bitmap to the proper array. The raid device is then mounted.

This script runs at every run level, and runs before any other init.d script. Future work: I need to find a way to make init.d wait for script completion.

Friday, May 4, 2012

DRAM Testing

For the past few months I have been working on restoring an old Commodore 64 that someone gave to me. It was missing a few obvious pieces, and after making a new AV cable and obtaining a new power supply I found that it wouldn't boot properly. So far I have replaced the PLA, VIC, capacitors, and a few other components. But this post is not about those things. This is about testing the Commodore 64's DRAM chips.

My particular C64 uses 8 individual RAM chips; most are D4164C-15 and a couple are D4164C-2. I replaced all the RAM, but I wanted to know whether both the original and replacement chips were functioning properly. I decided to test each chip, so I needed some hardware to test with.

I decided to use a MEGA8U2 AVR microprocessor, mainly because I have one that can plug into a breadboard and it has enough pins to drive the D4164C chips.

I wired the test setup to reduce the instruction count where I could, so it is a little messy. The wiring and instruction count could be greatly improved if I didn't need the programming header, but it gets the job done.



At this point I realized I had no idea how to control DRAM. My friend CNLohr gave me a quick explanation of how DRAM works (basically you have to refresh it within a time period, and this RAM appears to be 256 rows by 256 columns). Ok...

The first thing to do was to try to write and read just one bit. The datasheet for the DRAM provided timing window charts for each step required to perform all of the operations the memory is able to do. After a few hours of stepping through the charts, coding, re-coding, reviewing the charts, and sometimes just trial and error, I was finally able to write and read 1 bit from memory. After a couple more days of work I was reading and writing to the entire memory module. (I forget exactly what made this take so long to accomplish. Some kind of bug in my program.)

I constructed a couple of routines to test both the wiring and the memory. The tests are largely based off of information found at http://www.ganssle.com/testingram.htm. There's a lot of good information there for developing ram tests.

To test the wires, I wrote 0 to the first bit of memory, followed by a 1 to a power-of-2 memory location (high on just one wire). I then read memory location zero; if the value is no longer 0, it indicates a failure on a specific address wire.
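The address-wire check can be sketched like this. It is a host-side simulation of the test logic over a plain array, assuming 8 address bits as on the D4164C; the real test drove the chip's multiplexed address lines from the AVR, and the names here are illustrative.

```c
/* Host-side simulation of the address-wire check; the real test ran
   on the AVR against the physical chip. Sizes/names are illustrative. */
#define ADDR_BITS 8                  /* D4164C addresses 256 rows x 256 columns */
#define MEM_SIZE  (1 << ADDR_BITS)

static unsigned char mem[MEM_SIZE];  /* stand-in for one row of DRAM cells */

/* Returns -1 if all address lines look good, else the suspect bit index. */
int addr_wire_test(void) {
    for (int bit = 0; bit < ADDR_BITS; bit++) {
        mem[0] = 0;                  /* write 0 to location zero */
        mem[1 << bit] = 1;           /* write 1 with exactly one address line high */
        if (mem[0] != 0)             /* a stuck or shorted line aliases back to 0 */
            return bit;
    }
    return -1;
}
```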

I used a walking one algorithm with a bit inversion to test all the memory cells. The goal is to toggle as many bits as possible.
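A sketch of the walking-one-with-inversion idea, simulated over a byte-wide array for illustration (the real D4164C is 1 bit wide, so on hardware the "one" walks across addresses rather than within a word; this just shows the shape of the algorithm).

```c
#include <stdint.h>

#define CELLS 256
static uint8_t cells[CELLS];         /* stand-in for the memory under test */

/* The "one" walks across bit positions; the second pass inverts it so
   every bit gets toggled both ways. */
static uint8_t pattern_for(int i, int inverted) {
    uint8_t p = (uint8_t)(1u << (i & 7));
    return inverted ? (uint8_t)~p : p;
}

/* Returns 1 if every cell read back what was written, else 0. */
int walking_one_test(void) {
    for (int pass = 0; pass < 2; pass++) {
        for (int i = 0; i < CELLS; i++)
            cells[i] = pattern_for(i, pass);
        for (int i = 0; i < CELLS; i++)
            if (cells[i] != pattern_for(i, pass))
                return 0;            /* bad cell: latch the error LED */
    }
    return 1;
}
```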

In either case, if there's an error, the red LED turns off permanently. While the test is running, the LED blinks at the end of each complete cycle.

I was able to test all the memory modules I had replaced. They were all functioning properly.

Update 5/5/2012:
The source code can be found at https://github.com/axlecrusher/AvrProjects/tree/master/avr_dramTest

Saturday, October 15, 2011

Recovering RAID5 from Multiple Simultaneous Drive Failures

RAID5 is a redundant disk system that protects against a single drive failure. The array can keep functioning, allowing you to replace the defective disk and rebuild the raid without any data loss. However, if more than one disk fails at a time, RAID5 will not help you (there are other raid levels that can). Sudden multiple disk failure is exactly what happened to my system one night.

Edit (9/15/2012): I added additional information at the bottom of this post which makes the re-assembly process easier. I recommend it rather than the --create procedure detailed below. It is still a good idea to read the entire post though.

Once or twice in the past I have had a single drive fail because of a loose or faulty SATA cable. This is easily resolved by powering down the computer and re-securing the cable. I usually notice a drive failure within a week (I should set up an alert system). But recently, two drives failed within two hours of each other. I hadn't even noticed the first drive failure before the second drive failed. Rebooting the computer cleared up the SATA errors that brought the drives down. The drives seemed to be functioning properly; they hadn't suffered a hardware failure. However, the raid could not rebuild itself because Linux had marked both drives as faulty. At this point I had 6TB of data at risk, with partial backups several months old. I was mostly worried about photos I had taken over the past several months that can't be replaced.

So what to do... try not to panic, this is going to get messy.

I began by trying to figure out which drives failed and in which order by issuing mdadm --examine for every device in the array. I focused on the last portion of the output, which contains the status of each device. The data is recorded independently on each device in the raid, so you can compare the output and find differences. In a properly functioning raid the output should be identical for each device. Below is the output for the /dev/sda1 device.

Number Major Minor RaidDevice State
this 3 8 81 3 active sync /dev/sdf1

0 0 8 17 0 active sync /dev/sdb1
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 8 81 3 active sync /dev/sdf1
4 4 8 97 4 active sync /dev/sdg1


Knowing that I lost 2 drives, I figured that this drive had not failed simply because the failures were recorded on this disk.

Continuing, I eventually found the second drive that failed because it only had a record of one drive failure. This drive must have been functioning during the first failure but not the second.
Number Major Minor RaidDevice State
this 1 8 33 1 active sync /dev/sdc1

0 0 8 17 0 active sync /dev/sdb1
1 1 8 33 1 active sync /dev/sdc1
2 2 0 0 2 faulty removed
3 3 8 81 3 active sync /dev/sdf1
4 4 8 97 4 active sync /dev/sdg1


Then I found the first drive that failed because there was no failure recorded on it at all.

Number Major Minor RaidDevice State
this 2 8 49 2 active sync /dev/sdd1

0 0 8 17 0 active sync /dev/sdb1
1 1 8 33 1 active sync /dev/sdc1
2 2 8 49 2 active sync /dev/sdd1
3 3 8 81 3 active sync /dev/sdf1
4 4 8 97 4 active sync /dev/sdg1


Note: I think you can also use the "Update Time" from mdadm --examine to figure this information out. I used it to verify that my logic was correct.

Important Note: My computer likes to move SATA devices around at boot time. So the device names listed in the raid status outputs were not accurate after rebooting. The above mdadm output says /dev/sdd1 had failed, but the device name I queried mdadm for was /dev/sdf1. You MUST match the current device with the array device number. The correct device order is essential for fixing the raid.

Now that I knew the devices and the order in which they failed I could do a little more thinking.

I figured that since Linux halted the file system and stopped the raid when the 2nd device failed, there shouldn't be too much data corruption. Probably only the data that was being written to disk near the time of failure. This data wasn't too important. The last important data had been written a few days earlier, so it should have been flushed from the caches to disk. With these assumptions, I decided it should be possible to just tell the array that only the 1st failed drive is broken and that the second is OK. Apparently you can't really do this. The only way to do it is to destroy the array and rebuild it.

That's right, destroy and then rebuild the array. Pray for all the datas.

I googled around for a while to try to see if my idea was possible, it seemed to be. The best validation for my idea came from this blog.

The rebuilding process...

The first step was to stop the raid device with mdadm --stop before I could start destroying and re-creating it. If you don't do this, you get strange errors from mdadm saying it can't write to the devices. It was aggravating to figure out why, so just do it.

I decided I needed to protect myself from myself and possibly from mdadm. I wanted to make sure there was no chance that I would accidentally rebuild the array using the first failed (most out of sync) drive, so I zeroed that device's raid superblock: mdadm --zero-superblock /dev/sdf1. Now it is no longer associated with any raid device.

Next I used the output from the mdadm --examine commands to help me construct the command to rebuild the raid.
mdadm --verbose --create --metadata=0.90 /dev/md0 --chunk=128 --level=5 --raid-devices=5 /dev/sdd1 /dev/sde1 missing /dev/sda1 /dev/sdb1


IMPORTANT: Notice that the device order is not the same as listed in the mdadm --examine output. This is because my computer moves the SATA devices around. It is CRITICAL that you rebuild the array with the devices in the proper order. Use the array device number for "this" device from the output of the mdadm --examine commands to help you order the devices correctly.

I specified the chunk size using the value from mdadm --examine. I found I also had to specify the metadata version. Mdadm by default used a newer metadata version, which altered the amount of space on each device. The space used for the rebuild needed to be exactly the same as in the original array setup, otherwise the data stripes won't line up (and your data will be munged). You can rebuild the array as many times as you like so long as you don't write data to the broken array setup. I rebuilt my array 3 or 4 times before I got it right.

To check whether the array setup was correct, I ran e2fsck -B 4096 -n /dev/md0 (the Linux file system check utility). I decided it was safer to specify the file system block size to make sure e2fsck got it right. Since I was just testing the array setup, I didn't want e2fsck making any changes to the disk, hence the -n. If the array setup is incorrect, the striped data won't line up, so e2fsck won't find any superblock and will refuse to scan. If e2fsck is able to perform a scan, then the array setup must be OK (at least that's the hope).

The next part is probably the most dangerous part because at this point you are editing the file system data on the disks.

Once I was sure the array setup was correct, I ran e2fsck -B 4096 /dev/md0 to fix all the file system errors. There were thousands of group errors, a few dozen inode errors, and a lot of bitmap errors. The wait was nerve-wracking, but eventually it finished. I was able to mount the file system read-only (for safety) and list the files; I was even able to open a picture.

Lastly, I added the first failed drive back into the array with mdadm -a /dev/md0 /dev/sdf1, and the array began rebuilding the parity bits.

At this point I began dumping all important data to another drive just to have a current backup. Once the parity bits have been rebuilt I will remount the partition as read-write.

That's it, RAID5 recovered from multiple drive failure.

Edit (9/15/2012): I found another page with helpful advice: http://zackreed.me/articles/50-recovery-from-a-multiple-disk-failure-with-mdadm Instead of using --create, you can use --assemble --force with the drives that you want to mark as clean and use. This will assemble the array in degraded mode with the devices in the correct order. You can then zero the superblock of the 1st failed drive and --add it to the array.

Friday, July 15, 2011

Tor: An Experiment in Anonymity

I first learned about Tor while reading about darknets and it has been in the back of my mind ever since. Every once in a while I would read something that mentioned Tor. Eventually my interest had been piqued enough that I wanted to know more about Tor.

Enter the Tor Project

The Tor Project is a program that joins your computer to a network of other computers in which data is encrypted and then routed through relays until it reaches its destination. Routing paths are chosen at random and periodically change. Nearly any type of internet traffic can be routed through Tor. Tor adds anonymity to internet traffic because once traffic is encrypted and enters the network, it is impossible to determine its source. (This is false if an attacker is capable of watching both the source and destination networks.)

Tor was originally the Onion Routing program, a program developed by the U.S. Navy for military use. Today Tor is open source and is used by people all over the world, including journalists, military, law enforcement, and activists. It's used by people who reside in countries, such as China, to obtain information that would otherwise be censored or restricted.

After learning this from the Tor site, I was extremely excited to play with this technology. One download and a few command lines later I was running the Tor browser package. I quickly looked through the settings to find what options were available and what could be tweaked. I found that you could relay traffic for the Tor network. Cool! Providing uncensored internet access to people in oppressive countries is noble, right? Turn that option on and additional options become available, including some stuff about exit nodes.

What are exit nodes? They are relays that provide an exit point for data within the Tor network. This is the point where the encrypted Tor data is decrypted and forwarded on to the requested server. From the server's point of view, the traffic is coming from the exit node. So, enthusiastically, I checked most if not all of the boxes available, allowing HTTP, HTTPS, email, IM and IRC, as well as miscellaneous other services. Part of me also wanted to see what this data looks like coming across the network.

In the meantime I continued surfing around the internet, reading more about exit nodes since this was a new concept to me. I found material describing how they work, how to configure them, and advice on how to run an exit node with minimal harassment from ISPs.

After an hour or two I noticed that there was a decent amount of traffic flowing through Tor; about 20MB had been transferred. What did it look like? The traffic between Tor nodes was definitely encrypted, and exiting web requests seemed to be encrypted over HTTPS. I don't recall seeing any other data, but who knows; I wasn't looking very hard.

This got me thinking: what kind of requests had come out of my exit node? My first thought was bittorrent or similar P2P data. I don't want to deal with calls from my ISP about DMCA takedowns, so I decided to stop running an exit node and just run in relay mode.

The next day I moved the relay to a hosted server rather than keeping it at my residence. I also contemplated running it as an exit node on the hosted server but ultimately decided against that. Initially I made this decision because of possible DMCA takedown notices, but I also contemplated the possibility of attacks being launched from my exit node. Over the next few days I kept reading about exit nodes and people's experiences with them. I ran across a handful (about 3 or 4) of horror stories of Tor exit nodes being implicated in accessing illegal porn. That is definitely something I don't want to be tied up in. So I feel that my decision to leave my Tor node in relay-only mode is the correct one. I'll leave exit nodes to people and organizations that can deal with the legal ramifications of abuse.

The only question in my mind right now is: during the two hours my exit node was running, what data was requested from my IP? I will probably never know. With approximately 2600 exit nodes, I don't think there was a very high probability that something terrible was requested through my slow-rated exit node. I think most of the data transferred was relay data rather than exit data. But you never know. Here's to the police not kicking down my door.