Saturday, October 15, 2011

Recovering RAID5 from Multiple Simultaneous Drive Failures

RAID5 is a redundant disk system that protects against a single drive failure. The array can keep functioning, allowing you to replace the defective disk and rebuild the array without any data loss. However, if more than one disk fails at a time, RAID5 will not help you (there are other RAID levels that can). Sudden multiple disk failure is exactly what happened to my system one night.

Edit (9/15/2012): I added additional information at the bottom of this post which makes the re-assembly process easier. I recommend it rather than the --create procedure detailed below. It is still a good idea to read the entire post though.

Once or twice in the past I have had a single drive fail because of a loose or faulty SATA cable. This is easily resolved by powering down the computer and re-securing the cable. I usually notice a drive failure within a week (I should set up an alert system). But recently, I had two drives fail within two hours of each other. I hadn't even noticed the first drive failure before the second drive failed. Rebooting the computer cleared up the SATA errors that brought the drives down. The drives seemed to be functioning properly; they hadn't suffered a hardware failure. However, the raid could not rebuild itself because Linux had marked both drives as faulty. At this point I had 6TB of data at risk, with partial backups several months old. I was mostly worried about photos that I had taken over the past several months that can't be replaced.

So what to do... try not to panic, this is going to get messy.

I began by trying to figure out which drives failed and in which order by issuing mdadm --examine for every device in the array. I focused on the last portion of the output, which contains the status of each device. The data is recorded independently on each device in the raid, so you can compare the output and find differences. In a properly functioning raid the output should be identical for each device. Below is the output for the /dev/sda1 device.

Number Major Minor RaidDevice State
this 3 8 81 3 active sync /dev/sdf1

0 0 8 17 0 active sync /dev/sdb1
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 8 81 3 active sync /dev/sdf1
4 4 8 97 4 active sync /dev/sdg1


Knowing that I had lost two drives, I figured this drive had not failed, simply because both failures were recorded on it.

Continuing, I eventually found the second drive that failed, because it only had a record of one drive failure. This drive must have been functioning during the first failure but not the second.

Number Major Minor RaidDevice State
this 1 8 33 1 active sync /dev/sdc1

0 0 8 17 0 active sync /dev/sdb1
1 1 8 33 1 active sync /dev/sdc1
2 2 0 0 2 faulty removed
3 3 8 81 3 active sync /dev/sdf1
4 4 8 97 4 active sync /dev/sdg1


Then I found the first drive that failed because there was no failure recorded on it at all.

Number Major Minor RaidDevice State
this 2 8 49 2 active sync /dev/sdd1

0 0 8 17 0 active sync /dev/sdb1
1 1 8 33 1 active sync /dev/sdc1
2 2 8 49 2 active sync /dev/sdd1
3 3 8 81 3 active sync /dev/sdf1
4 4 8 97 4 active sync /dev/sdg1


Note: I think you can also use the "Update Time" from mdadm --examine to figure this information out. I used it to verify that my logic was correct.
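That comparison can be scripted. Below is a minimal sketch with fabricated examine snippets standing in for real output (on a live system you would save the output of mdadm --examine /dev/sdX1 for each member device); the drive with the oldest "Update Time" is the one that dropped out of the array first:

```shell
# Fabricated stand-ins for saved `mdadm --examine` output; the
# filenames and timestamps here are made up for illustration.
printf 'Update Time : Sat Oct 15 01:10:22 2011\n' > sdd1.examine
printf 'Update Time : Sat Oct 15 03:42:05 2011\n' > sdc1.examine
printf 'Update Time : Sat Oct 15 03:58:11 2011\n' > sda1.examine

# Print each device's last update time; the oldest timestamp belongs
# to the drive that failed first (sdd1 in this fabricated data).
for f in sda1.examine sdc1.examine sdd1.examine; do
    printf '%-14s %s\n' "$f" "$(grep 'Update Time' "$f")"
done
```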

Important Note: My computer likes to move SATA devices around at boot time. So the device names listed in the raid status outputs were not accurate after rebooting. The above mdadm output says /dev/sdd1 had failed, but the device name I queried mdadm for was /dev/sdf1. You MUST match the current device with the array device number. The correct device order is essential for fixing the raid.

Now that I knew the devices and the order in which they failed I could do a little more thinking.

I figured that since Linux halted the file system and stopped the raid when the 2nd device failed, there shouldn't be too much data corruption. Probably only the data being written to disk near the time of failure, and that data wasn't too important. The last important data had been written a few days earlier, so it should have been flushed from the caches to disk. With these assumptions, I decided it should be possible to just tell the array that only the 1st failed drive is broken and that the second is OK. Apparently you can't really do this. The only way to do it is to destroy the array and rebuild it.

That's right, destroy and then rebuild the array. Pray for all the datas.

I googled around for a while to try to see if my idea was possible, it seemed to be. The best validation for my idea came from this blog.

The rebuilding process...

The first step was to stop the raid device (mdadm --stop /dev/md0) before I could start destroying and re-creating it. If you don't do this, you get strange errors from mdadm saying it can't write to the devices. It was aggravating to figure out why, so just do it.

I decided I needed to protect myself from myself, and possibly from mdadm. I wanted to make sure there was no chance that I would accidentally rebuild the array using the first failed (most out of sync) drive, so I zeroed that device's raid superblock: mdadm --zero-superblock /dev/sdf1. Now it is no longer associated with any raid device.

Next I used the output from the mdadm --examine commands to help me construct the command to rebuild the raid.
mdadm --verbose --create --metadata=0.90 /dev/md0 --chunk=128 --level=5 --raid-devices=5 /dev/sdd1 /dev/sde1 missing /dev/sda1 /dev/sdb1


IMPORTANT: Notice that the device order is not the same as listed in the mdadm --examine output. This is because my computer moves the SATA devices around. It is CRITICAL that you rebuild the array with the devices in the proper order. Use the array device number for "this" device from the output of the mdadm --examine commands to order the devices correctly.

I specified the chunk size using the value from mdadm --examine. I found I also had to specify the metadata version. By default, mdadm used a newer metadata version, which altered the amount of space available on each device. The space used for the rebuild needed to be exactly the same as in the original array setup, otherwise the data stripes won't line up (and your data will be munged). You can re-create the array as many times as you like so long as you don't write data to a broken array setup. I rebuilt my array 3 or 4 times before I got it right.

To check whether the array setup was correct I ran e2fsck -B 4096 -n /dev/md0 (the Linux file system check utility). I decided it was safer to specify the file system block size to make sure e2fsck got it right. Since I was just testing the array setup, I didn't want e2fsck making any changes to the disk, hence the -n. If the array setup is incorrect, the striped data won't line up, e2fsck won't find any superblock, and it will refuse to scan. If e2fsck is able to perform a scan, then the array setup must be OK (at least that's the hope).

The next part is probably the most dangerous part because at this point you are editing the file system data on the disks.

Once I was sure the array setup was correct, I ran e2fsck -B 4096 /dev/md0 to fix all the file system errors. There were thousands of group errors, a few dozen inode errors, and a lot of bitmap errors. The wait was nerve-wracking, but eventually it finished. I was able to mount the file system read-only (for safety) and list the files; I was even able to open a picture.

Lastly, I added the first failed drive back into the array (mdadm -a /dev/md0 /dev/sdf1) and the array began rebuilding the parity bits.

At this point I began dumping all important data to another drive just to have a current backup. Once the parity bits have been rebuilt I will remount the partition as read-write.

That's it, RAID5 recovered from multiple drive failure.

Edit (9/15/2012): I found another page that has helpful advice. http://zackreed.me/articles/50-recovery-from-a-multiple-disk-failure-with-mdadm Instead of using --create, you can use --assemble --force with the drives that you want to mark as clean and use. This will assemble the array in degraded mode with the devices in the correct order. You can then zero the superblock of the 1st failed drive and --add it to the array.
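A sketch of that --assemble --force procedure, using the device names from my system (yours will differ; match the current devices to the array device numbers as described above):

```
# stop the broken array first
mdadm --stop /dev/md0
# force-assemble in degraded mode, leaving out the first failed drive
mdadm --assemble --force /dev/md0 /dev/sdd1 /dev/sde1 /dev/sda1 /dev/sdb1
# wipe the stale superblock on the first failed drive, then re-add it
mdadm --zero-superblock /dev/sdf1
mdadm --add /dev/md0 /dev/sdf1
```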

Wednesday, February 2, 2011

Verizon DSL Throttling Video Data

It appears Verizon DSL is throttling video data. My roommate and I noticed a few days ago that videos were playing terribly. It didn't matter what site they were streaming from (YouTube, Hulu, etc.); all videos were buffering extremely slowly. There was no abnormal bandwidth usage at the time. Today I decided to do some testing...

Originally I used Chrome for the test, but repeated it in Firefox so that I could get accurate timing data.

Here is my test:

I connected to my encrypted VPN service. I started Firefox and used private mode to avoid caching. I loaded a video from youtube. Below is a screen shot of the load time for the video while using Verizon DSL with an encrypted VPN. Notice it took 30 seconds to load 2.9 MB of video.



Next, I closed the browser and disconnected from the VPN. I started Firefox in private mode again (to avoid caching) and loaded the same video. This time it took 1 minute and 57 seconds to load 2.9 MB of video! That is 4 times slower! Verizon is clearly throttling the video. Image below.



I urge people using Verizon (or any ISP) to perform their own tests. Make these incidents known!

EDIT: At this point I have only tested with youtube. I may test with other sites in the future. Subjectively, Hulu seems ok today.

Friday, January 29, 2010

flac2mp3

Being unsatisfied with the other flac to mp3 conversion tools out there, I wrote a quick Perl script to get the job done. It even handles ID3 tags; the lack of this feature was my problem with other tools.

The script is run as ./flac2mp3 *.flac


#!/usr/bin/perl

foreach $argnum (0 .. $#ARGV)
{
    my $flac = $ARGV[$argnum];
    $flac =~ s/\.flac$//;  # strip the .flac extension (dot escaped so it isn't a wildcard)
    $flac =~ s/"/\\"/g;    # escape " in the filename

    my $tagdata = `metaflac --export-tags-to=- \"$flac.flac\"`;

    $tagdata =~ s/"/\\"/g; # escape " in the tag values

    # print "$tagdata\n";
    # translate FLAC tag names into lame ID3 options
    $tagdata =~ s/TITLE=/--tt "/;
    $tagdata =~ s/ALBUM=/--tl "/;
    $tagdata =~ s/ARTIST=/--ta "/;
    $tagdata =~ s/GENRE=/--tg "/;
    $tagdata =~ s/TRACKNUMBER=/--tn "/;
    $tagdata =~ s/DATE=.*(\d{4}).*/--ty "$1/;
    # $tagdata =~ s/DISCNUMBER=/--tv cd="/;

    $tagdata =~ s/\n/" \n/gm;      # add a closing " to the end of every line
    $tagdata =~ s/^[^--].*|\n//gm; # remove untranslated tags and make it all one line
    # print "$tagdata\n";

    # print "flac -d \"$flac.flac\" -o - | lame -V 2 -h $tagdata - -o \"$flac.mp3\"\n";

    system("flac -d \"$flac.flac\" -o - | lame -V 1 -h $tagdata - -o \"$flac.mp3\"");
}

Tuesday, July 21, 2009

Preventing SQL injection (again)

Recently I had to update an old Perl program which, when it was originally written, had no sanitization of the user input used in SQL statements. The user input (from the web) was simply concatenated into SQL statements. This made it very vulnerable to SQL injection.

The SQL DBI used in the program did not allow parameterized queries, and replacing it with a newer DBI would have required massive logic changes to the program. The solution was to figure out how to properly escape special characters present in the input. This turned out to be pretty simple if the input is surrounded by single quotes within the SQL statement. Assuming this is true, single quotes present in the input can be replaced with two single quotes. This protects the SQL from injection.

Why? ANSI SQL says that a single quote is escaped by inserting an additional single quote directly before it. Escaping single quotes makes it very difficult, if not impossible, for the input to terminate the SQL string. However, this only works (at least on Informix) if the input string is surrounded by single quotes in the SQL. Input strings surrounded by double quotes cannot be escaped this way.
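The substitution itself is a one-liner. A minimal sketch in shell (the program did it in Perl, presumably with something like s/'/''/g; the sample input here is made up):

```shell
# Double every single quote in the input so it cannot terminate the
# SQL string literal it is embedded in.
escaped="$(printf %s "O'Brien'; drop table users; --" | sed "s/'/''/g")"
printf '%s\n' "$escaped"
# -> O''Brien''; drop table users; --
```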

Using this method, combined with expanding function calls within strings, I was able to prevent SQL injection without major DBI and logic changes.

Thursday, July 9, 2009

Been a while

It's been a while since my last post, and a lot has happened since.

Recently (in my spare time) I have been focusing on the second version of my game engine, Mercury.
Development is picking up speed and the project is really taking shape. My personal goal is to get the engine functioning enough to make a few short games. The previous version of the engine was successfully used by the UMBC game development club for their year long 3D project. I'll probably start writing about graphics programming more than anything else.

Saturday, January 31, 2009

Writing secure SQL applications

When writing applications that make use of SQL, specifically applications that live on the web, security should be a high priority. Unfortunately, security usually ends up as just an afterthought. In my experience reviewing and maintaining web applications written by others, I have found that they take little to no precaution against SQL injection.

SQL injection is the practice of crafting user input to alter the function of a dynamically generated SQL statement. In web-based languages, SQL statements are usually constructed using string concatenation to combine the query text with the query values. This can lead to very dangerous conditions. Consider the following simple query.

select username from user_table where email='myemail@email.com'

Assuming the email address is inserted into the query using string concatenation it is trivial to alter how the query functions. If I entered my e-mail as:

myemail@email.com'; drop table user_table; --

The resulting query would be:

select username from user_table where email='myemail@email.com'; drop table user_table; --'

This would instruct the SQL server to drop the table (assuming the application has adequate permissions). Of course you can construct any statement you wish to manipulate the SQL server.
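The dangerous concatenation step is easy to see in miniature. A sketch in shell, purely to build the string (the web apps in question did this in Perl or PHP):

```shell
# Naive string concatenation of user input into a query, exactly as a
# vulnerable application would do it.
email="myemail@email.com'; drop table user_table; --"
query="select username from user_table where email='${email}'"
printf '%s\n' "$query"
# -> select username from user_table where email='myemail@email.com'; drop table user_table; --'
```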

The usual protection against this type of attack is to escape special characters such as ' and ;. This can help improve security, but it is not foolproof.

Consider the following:

select username from user_table where id=123456

If the user id could be manipulated by the user it would be possible to make the id something like:

123456 and 1=(delete from user_table where id != 123456)

The resulting query would be:

select username from user_table where id=123456 and 1=(delete from user_table where id != 123456)

This would instruct the SQL server to delete all users whose id is not 123456. Notice that we have not used any special characters, so escaping would not help in this situation.

Now there is a rather nice solution to these problems, parameterized queries. Parameterized queries allow you to prepare queries and then send in values at execution time.

Using the last example, the parameterized query would look like this:

select username from user_table where id=?

Parameters are usually indicated with a ?, but this may depend on the SQL library. The query is prepared by using something similar to:

$query = $db->prepare("select username from user_table where id=?");

We only need to do that once.

Then we can execute it as many times as we want with something similar to:

$result = $query->execute("123456");

The neat thing is that the database library handles inserting the parameters into the query; there is no need to escape special characters with this method. I would argue that using this method makes SQL injection extremely difficult, if not impossible.

Using parameterized queries is usually a little more work than just concatenating strings, but the benefits are well worth the extra effort.

I have used parameterized queries in both Perl and PHP with Informix and MySQL databases. The Perl module is DBI; the PHP class is mysqli. They function differently, but the concept is the same.

Sunday, November 16, 2008

PS3 MP4 Encoding with Linux

After a lot of reading, searching, and trial and error, I have come up with a fairly simple way to convert video and audio content into an h264 and aac stream. The stream is packed into an MP4 file and can be streamed to a PS3 via a media server such as MediaTomb.

The following script makes it a pretty automated process. It relies on mencoder and MP4Box. I have been using it to convert DVD vob files into smaller MP4 files with nearly the same quality; I can't actually tell a difference from the original DVD source.


#!/bin/bash
# $1 = source file, $2 = target bitrate (kbps), $3 = target base name
VIDEOFILTER=crop=704:480:8:0
ENCOPTS=subq=5:bframes=4:b_pyramid:weight_b:psnr:frameref=3:bitrate=$2:turbo=1:me=hex:partitions=all:8x8dct:qcomp=0.7:threads=auto

# pass 1: analysis pass, video output discarded; also rips English subtitles
mencoder -v \
"$1" \
-alang en \
-vf $VIDEOFILTER \
-ovc x264 -x264encopts $ENCOPTS:pass=1:turbo=1 \
-ofps 24000/1001 \
-vobsubout "$3_subtitles" -vobsuboutindex 0 -slang en \
-passlogfile "$3.log" \
-oac copy \
-o /dev/null

# pass 2: real encode, audio converted to AAC
mencoder -v \
"$1" \
-alang en \
-vf $VIDEOFILTER \
-ovc x264 -x264encopts $ENCOPTS:pass=2 \
-oac faac -faacopts object=1:tns:quality=150 \
-passlogfile "$3.log" \
-ofps 24000/1001 \
-o "$3.avi"

# demux the raw h264 and AAC streams out of the avi container
MP4Box -aviraw video "$3.avi"
MP4Box -aviraw audio "$3.avi"
mv "$3_audio.raw" "$3_audio.aac"

rm -f "$3.mp4"

# remux into a hinted, ISMA-compliant MP4 that the PS3 can stream
MP4Box -isma -hint -add "$3_video.h264#Video:fps=23.976" -add "$3_audio.aac" "$3.mp4"

rm "$3_video.h264" "$3_audio.aac"


Running the script involves something like...

./x264Encode2.sh Terminator2.vob 3800 Terminator2


The first argument is the source file, the second is the target bitrate, and the 3rd is the target file name. I do not put an extension on the target, as the script adds it. A bitrate of 3800 seems high enough to produce an encode that looks nearly identical to the source DVD.

You will probably need to tweak the VIDEOFILTER crop filter according to your video source. You can run mplayer on your source using -vf cropdetect to determine the correct cropping arguments. Of course you can also change the ENCOPTS to your liking, although I find the current options acceptable.
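For example, something like this plays a short stretch of the source while printing suggested crop= values to the console (the seek offset and frame count are arbitrary):

```
mplayer -ss 60 -frames 200 -vf cropdetect -vo null -nosound source.vob
```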

Once the encode is finished, you have a nice MP4 file. The avi file is left behind after the encode in case something goes wrong, so you don't have to re-encode the source. If everything goes OK, you can delete this file.

Let me know what you think.

EDIT: The PS3 seems to be picky about the video resolutions, be careful when cropping to strange resolutions. I'll try to investigate this more, as I ran into a problem with a cropped wide screen video. Removing or tweaking the crop made a short test encode work.