Peer-To-Peer News - The Week In Review - September 10th, 22 - P2P-Zone

JackSpratts · 09-09-22, 06:20 AM

Since 2002

September 10th, 2022

Dead USB Drives Are Fine: Building a Reliable Sneakernet
John Goerzen

“OK,” you’re probably thinking. “John, you talk a lot about things like Gopher and personal radios, and now you want to talk about building a reliable network out of… USB drives?”

Well, yes. In fact, I’ve already done it.

What is sneakernet?

Normally, “sneakernet” is a sort of tongue-in-cheek reference to using disconnected storage to transport data or messages. By “disconnected storage” I mean anything like CD-ROMs, hard drives, SD cards, USB drives, and so forth. There are times when loading up 12TB on a device and driving it across town is just faster and easier than using the Internet for the same. And, sometimes you need to get data to places that have no Internet at all.

Another reason for sneakernet is security. For instance, if your backup system is online, and your systems being backed up are online, then it could become possible for an attacker to destroy both your primary copy of data and your backups. Or, you might use a dedicated computer with no network connection to do GnuPG (GPG) signing.

What about “reliable” sneakernet, then?

TCP is often considered a “reliable” protocol. That means that the sending side is generally able to tell if its message was properly received. As with most reliable protocols, we have these components:

1. After transmitting a piece of data, the sender retains it.
2. After receiving a piece of data, the receiver sends an acknowledgment (ACK) back to the sender.
3. Upon receiving the acknowledgment, the sender removes its buffered copy of the data.
4. If no acknowledgment is received at the sender, it retransmits the data, in case it was lost in transit.
5. The receiver reorders any packets that arrive out of order, so that the recipient’s data stream is ordered correctly.
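
As a rough illustration of components 1 through 4, here is a toy sketch that uses plain directories as the "outbox", the "media", and the "inbox". The function names and the .ack file convention are made up for this example; this is not how NNCP stores anything, and it leaves out reordering (component 5) entirely.

```shell
# Sender: copy (never move) every queued packet onto the media,
# so the sender retains its own copy until an ACK comes back.
send_all() {   # usage: send_all OUTBOX MEDIA
    for pkt in "$1"/*; do
        [ -e "$pkt" ] || continue
        cp "$pkt" "$2/"
    done
}

# Receiver: take packets off the media and leave an ACK file for each one.
receive_all() {   # usage: receive_all MEDIA INBOX
    for pkt in "$1"/*; do
        [ -e "$pkt" ] || continue
        case "$pkt" in *.ack) continue ;; esac
        name=$(basename "$pkt")
        mv "$pkt" "$2/$name"
        : > "$1/$name.ack"
    done
}

# Sender again: only an ACK frees the retained copy.
process_acks() {   # usage: process_acks MEDIA OUTBOX
    for ack in "$1"/*.ack; do
        [ -e "$ack" ] || continue
        name=$(basename "$ack" .ack)
        rm -f "$2/$name" "$ack"
    done
}

work=$(mktemp -d)
mkdir "$work/outbox" "$work/media" "$work/inbox"
echo "hello" > "$work/outbox/pkt1"

# Trip 1: the drive dies on the way, so nothing arrives and no ACK returns.
send_all "$work/outbox" "$work/media"
rm -f "$work/media"/*

# Trip 2: the retained copy goes out again, is received and ACKed,
# and only then is removed from the outbox.
send_all "$work/outbox" "$work/media"
receive_all "$work/media" "$work/inbox"
process_acks "$work/media" "$work/outbox"
ls "$work/inbox"
```

The point of the sketch is that losing the media on any trip costs nothing but time: the packet stays in the outbox until an ACK for it has actually made the return journey.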

Now, a lot of the things I just mentioned for sneakernet are legendarily unreliable. USB drives fail, CD-ROMs get scratched, hard drives get banged up. Think about putting these things in a bicycle bag or airline luggage. Some of them are going to fail.

You might think, “well, I’ll just copy files to a USB drive instead of moving them, and once I get them onto the destination machine, I’ll delete them from the source.” Congratulations! You are a human retransmit algorithm! We should be able to automate this!

And we can.

Enter NNCP

NNCP is one of those things that almost defies explanation. It is a toolkit for building asynchronous networks. It can use as a carrier: a pipe, TCP network connection, a mounted filesystem (specifically intended for cases like this), and much more. It also supports multi-hop asynchronous routing and asynchronous meshing, but these are beyond the scope of this particular article.

NNCP’s transports that involve live communication between two hops already had all the hallmarks of being reliable; there was a positive ACK and retransmit. As of version 8.7.0, NNCP’s ACKs themselves can also be asynchronous – meaning that every NNCP transport can now be reliable.

Yes, that’s right. Your ACKs can flow over tapes and USB drives if you want them to.

I use this for archiving and backups.

If you aren’t already familiar with NNCP, you might take a look at my NNCP page. I also have a lot of blog posts about NNCP.

Those pages describe the basics of NNCP: the “packet” (the unit of transmission in NNCP, which can be tiny or many TB), the end-to-end encryption, and so forth. The new command we will now be interested in is nncp-ack.

The Basic Idea

Here are the basic steps to processing this stuff with NNCP:

1. First, we use nncp-xfer -rx to process incoming packets from the USB (or other media) device. This moves them into the NNCP inbound queue, deleting them from the media device, and verifies the packet integrity.
2. We use nncp-ack -node $NODE to create ACK packets responding to the packets we just loaded into the rx queue. It writes a list of generated ACKs onto fd 4, which we save off for later use.
3. We run nncp-toss -seen to process the incoming queue. The use of -seen causes NNCP to remember the hashes of packets seen before, so a duplicate of an already-seen packet will not be processed twice. This command also processes incoming ACKs for packets we’ve sent out previously; if they pass verification, the relevant packets are removed from the local machine’s tx queue.
4. Now, we use nncp-xfer -keep -tx -mkdir -node $NODE to send outgoing packets to a given node by writing them to a given directory on the media device. -keep causes them to remain in the outgoing queue.
5. Finally, we use the list of generated ACK packets saved off in step 2 above. That list is passed to nncp-rm -node $NODE -pkt < $FILE to remove those specific packets from the outbound queue. The reason is that there will never be an ACK of ACK packet (that would create an infinite loop), so if we don’t delete them in this manner, they would hang around forever.

You can see these steps follow the same basic outline on upstream’s nncp-ack page.
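
The duplicate suppression that -seen provides in step 3 can be pictured with a small sketch. This is only an illustration of the idea of remembering hashes: the function name toss_once and the "seen" directory are hypothetical, sha256sum merely stands in for whatever hash NNCP really uses, and NNCP's own on-disk layout is not shown here.

```shell
# usage: toss_once PACKET SEENDIR
# Process a packet unless a packet with the same hash was handled before.
toss_once() {
    hash=$(sha256sum "$1" | cut -d' ' -f1)
    if [ -e "$2/$hash" ]; then
        # A retransmitted copy of something already processed: skip it.
        echo "duplicate"
        return 0
    fi
    # ... the real processing of the packet would happen here ...
    : > "$2/$hash"      # remember the hash for next time
    echo "processed"
}
```

This matters for a sneakernet because retransmission is the normal case: with -keep, the same packet is written to the media on every trip until its ACK arrives, so the receiver will routinely see copies of packets it has already handled.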

One thing to keep in mind: if anything else is running nncp-toss, there is a chance of a race condition between steps 1 and 2 (if nncp-toss gets to a packet first, an ACK might never be generated for it). This would sort itself out eventually, presumably, as the sender would retransmit and it would be ACKed later.

Further ideas

NNCP guarantees the integrity of packets, but not ordering between packets; if you need that, you might look into my Filespooler program. It is designed to work with NNCP and can provide ordered processing.

An example script

Here is a script you might try for this sort of thing. It may have more logic than you need – really, you just need the steps above – but hopefully it is clear.

Code:

#!/bin/bash

set -eo pipefail

MEDIABASE="/media/$USER"

# The local node name
NODENAME="`hostname`"

# All nodes.  NODENAME should be in this list.
ALLNODES="node1 node2 node3"

RUNNNCP=""
# If you need to sudo, use something like RUNNNCP="sudo -Hu nncp"
NNCPPATH="/usr/local/nncp/bin"

ACKPATH="`mktemp -d`"

# Process incoming packets.
#
# Parameters: $1 - the path to scan.  Must contain a directory
# named "nncp".
procrxpath () {
    while [ -n "$1" ]; do
        BASEPATH="$1/nncp"
        shift
        if ! [ -d "$BASEPATH" ]; then
            echo "$BASEPATH doesn't exist; skipping"
            continue
        fi

        echo " *** Incoming: processing $BASEPATH"
        TMPDIR="`mktemp -d`"

        # This rsync and the one below can help with
        # certain permission issues from weird foreign
        # media.  You could just eliminate it and
        # always use $BASEPATH instead of $TMPDIR below.
        rsync -rt "$BASEPATH/" "$TMPDIR/"

        # You may need these next two lines if using sudo as above.
        # chgrp -R nncp "$TMPDIR"
        # chmod -R g+rwX "$TMPDIR"
        echo "     Running nncp-xfer -rx"
        $RUNNNCP $NNCPPATH/nncp-xfer -progress -rx "$TMPDIR"

        for NODE in $ALLNODES; do
            if [ "$NODE" != "$NODENAME" ]; then
                echo "     Running nncp-ack for $NODE"

                # Now, we generate ACK packets for each node we will
                # process.  nncp-ack writes a list of the created
                # ACK packets to fd 4.  We'll use them later.
                # If using sudo, add -C 5 after $RUNNNCP.
                $RUNNNCP $NNCPPATH/nncp-ack -progress -node "$NODE" \
                    4>> "$ACKPATH/$NODE"
            fi
        done

        rsync --delete -rt "$TMPDIR/" "$BASEPATH/"
        rm -fr "$TMPDIR"
    done
}


proctxpath () {
    while [ -n "$1" ]; do
        BASEPATH="$1/nncp"
        shift
        if ! [ -d "$BASEPATH" ]; then
            echo "$BASEPATH doesn't exist; skipping"
            continue
        fi

        echo " *** Outgoing: processing $BASEPATH"
        TMPDIR="`mktemp -d`"
        rsync -rt "$BASEPATH/" "$TMPDIR/"
        # You may need these two lines if using sudo:
        # chgrp -R nncp "$TMPDIR"
        # chmod -R g+rwX "$TMPDIR"

        for DESTHOST in $ALLNODES; do
            if [ "$DESTHOST" = "$NODENAME" ]; then
                continue
            fi

            # Copy outgoing packets to this node, but keep them in the outgoing
            # queue with -keep.
            $RUNNNCP $NNCPPATH/nncp-xfer -keep -tx -mkdir -node "$DESTHOST" -progress "$TMPDIR"

            # Here is the key: that list of ACK packets we made above - now we delete them.
            # There will never be an ACK for an ACK, so they'd keep sending forever
            # if we didn't do this.
            if [ -f "$ACKPATH/$DESTHOST" ]; then
                echo "nncp-rm for node $DESTHOST"
                $RUNNNCP $NNCPPATH/nncp-rm -debug -node "$DESTHOST" -pkt < "$ACKPATH/$DESTHOST"
            fi

        done

        rsync --delete -rt "$TMPDIR/" "$BASEPATH/"
        rm -rf "$TMPDIR"

        # We only want to write stuff once.
        return 0
    done
}

procrxpath "$MEDIABASE"/*

echo " *** Initial tossing..."

# We make sure to use -seen to rule out duplicates.
$RUNNNCP $NNCPPATH/nncp-toss -progress -seen

proctxpath "$MEDIABASE"/*

echo "You can unmount devices now."

echo "Done."

https://changelog.complete.org/archi...ble-sneakernet

Free, Lossless…

FLAC
Josh Coalson

Format Overview

The basic structure of a FLAC stream is:

• The four byte string "fLaC"
• The STREAMINFO metadata block
• Zero or more other metadata blocks
• One or more audio frames

The first four bytes are to identify the FLAC stream. The metadata that follows contains all the information about the stream except for the audio data itself. After the metadata comes the encoded audio data.
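
To make the layout concrete, here is a small sketch that checks the "fLaC" marker and decodes the header of the first metadata block. The 4-byte block header layout used here (a 1-bit "last block" flag, a 7-bit block type, and a 24-bit length) comes from the FLAC format specification rather than from this overview, and the function name flac_first_block is made up for the example.

```shell
# usage: flac_first_block FILE
# Prints the "last block" flag, type and length of the first metadata block.
flac_first_block() {
    # Read the first 8 bytes as unsigned decimal numbers into $1..$8.
    set -- $(od -An -v -tu1 -N8 "$1")
    if [ "$#" -lt 8 ]; then
        echo "too short"
        return 1
    fi
    # 102 76 97 67 are the byte values of "fLaC".
    if [ "$1" -ne 102 ] || [ "$2" -ne 76 ] || [ "$3" -ne 97 ] || [ "$4" -ne 67 ]; then
        echo "not a FLAC stream"
        return 1
    fi
    last=$(( $5 >> 7 ))                          # top bit: is this the last metadata block?
    type=$(( $5 & 127 ))                         # low 7 bits: block type (0 is STREAMINFO)
    length=$(( ($6 << 16) | ($7 << 8) | $8 ))    # 24-bit big-endian length
    echo "last=$last type=$type length=$length"
}
```

On a well-formed file the first block should report type=0, since STREAMINFO is mandatory and comes first. A decoder walks the remaining metadata by skipping "length" bytes and reading the next header, until it meets a block whose last flag is set.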

METADATA

FLAC defines several types of metadata blocks (see the format page for the complete list). Metadata blocks can be any length and new ones can be defined. A decoder is allowed to skip any metadata types it does not understand. Only one is mandatory: the STREAMINFO block. This block has information like the sample rate, number of channels, etc., and data that can help the decoder manage its buffers, like the minimum and maximum data rate and minimum and maximum block size. Also included in the STREAMINFO block is the MD5 signature of the unencoded audio data. This is useful for checking an entire stream for transmission errors.

Other blocks allow for padding, seek tables, tags, cuesheets, and application-specific data. There are flac options for adding PADDING blocks or specifying seek points. FLAC does not require seek points for seeking but they can speed up seeks, or be used for cueing in editing applications.

Also, if you have a need of a custom metadata block, you can define your own and request an ID here. Then you can reserve a PADDING block of the correct size when encoding, and overwrite the padding block with your APPLICATION block after encoding. The resulting stream will be FLAC compatible; decoders that are aware of your metadata can use it and the rest will safely ignore it.

AUDIO DATA

After the metadata comes the encoded audio data. Audio data and metadata are not interleaved. Like most audio codecs, FLAC splits the unencoded audio data into blocks, and encodes each block separately. The encoded block is packed into a frame and appended to the stream. The reference encoder uses a single block size for the whole stream but the FLAC format does not require it.

BLOCKING

The block size is an important parameter to encoding. If it is too small, the frame overhead will lower the compression. If it is too large, the modeling stage of the compressor will not be able to generate an efficient model. Understanding FLAC's modeling will help you to improve compression for some kinds of input by varying the block size. In the most general case, using linear prediction on 44.1kHz audio, the optimal block size will be between 2-6 ksamples. flac defaults to a block size of 4096 in this case. Using the fast fixed predictors, a smaller block size is usually preferable because of the smaller frame header.

INTER-CHANNEL DECORRELATION

In the case of stereo input, once the data is blocked it is optionally passed through an inter-channel decorrelation stage. The left and right channels are converted to mid and side channels through the following transformation: mid = (left + right) / 2, side = left - right. This is a lossless process, unlike joint stereo: the division rounds down, but the bit it discards is the same as the lowest bit of side, so the decoder can restore it. For normal CD audio this can result in significant extra compression. flac has two options for this: -m, which always compresses both the left-right and mid-side versions of the block and takes the smallest frame, and -M, which adaptively switches between left-right and mid-side.
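
A minimal sketch of the round trip, using shell integer arithmetic. The function names are invented for the example, and it assumes the shell's >> is an arithmetic (sign-preserving) shift, which is the case in practice for bash and dash.

```shell
# usage: ms_encode LEFT RIGHT   -> prints "MID SIDE"
ms_encode() {
    # mid drops the lowest bit of (left + right); side keeps the full difference.
    printf '%s %s\n' "$(( ($1 + $2) >> 1 ))" "$(( $1 - $2 ))"
}

# usage: ms_decode MID SIDE   -> prints "LEFT RIGHT"
ms_decode() {
    # left + right and left - right always have the same parity,
    # so the bit dropped from mid is the lowest bit of side.
    sum=$(( ($1 * 2) | ($2 & 1) ))
    printf '%s %s\n' "$(( (sum + $2) >> 1 ))" "$(( (sum - $2) >> 1 ))"
}
```

For example, ms_encode 5 2 gives mid 3 and side 3, and ms_decode 3 3 gives back 5 and 2, even though (5 + 2) / 2 is not a whole number.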

MODELING

In the next stage, the encoder tries to approximate the signal with a function in such a way that when the approximation is subtracted, the result (called the residual, residue, or error) requires fewer bits-per-sample to encode. The function's parameters also have to be transmitted so they should not be so complex as to eat up the savings. FLAC has two methods of forming approximations: 1) fitting a simple polynomial to the signal; and 2) general linear predictive coding (LPC). I will not go into the details here, only some generalities that involve the encoding options.

First, fixed polynomial prediction (specified with -l 0) is much faster, but less accurate than LPC. The higher the maximum LPC order, the slower, but more accurate, the model will be. However, there are diminishing returns with increasing orders. Also, at some point (usually around order 9) the part of the encoder that guesses the best order to use will start to get it wrong and the compression will actually decrease slightly; at that point you will have to use the exhaustive search option -e to overcome this, which is significantly slower.

Second, the parameters for the fixed predictors can be transmitted in 3 bits whereas the parameters for the LPC model depend on the bits-per-sample and LPC order. This means the frame header length varies depending on the method and order you choose and can affect the optimal block size.
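
As a small worked example of the first method, the sketch below computes the residual for the order-2 fixed predictor, which extrapolates a straight line through the previous two samples: prediction = 2*s[n-1] - s[n-2]. That formula is the standard order-2 fixed predictor from the FLAC format documentation, not something spelled out in this overview, and the function name is made up. The first two samples are not predicted; they are simply consumed as warm-up.

```shell
# usage: residual_order2 S0 S1 S2 S3 ...
# Prints the residuals for S2 onward under the order-2 fixed predictor.
residual_order2() {
    prev2=$1
    prev1=$2
    shift 2
    out=""
    for s in "$@"; do
        pred=$(( 2 * prev1 - prev2 ))   # straight-line extrapolation
        out="$out $(( s - pred ))"      # what is left over after prediction
        prev2=$prev1
        prev1=$s
    done
    printf '%s\n' "${out# }"
}
```

Feeding it the samples 10 12 14 16 19 yields the residuals 0 0 1: a steadily rising signal is predicted almost perfectly, so the residual consists of small numbers that take far fewer bits to store than the original samples.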

RESIDUAL CODING

Once the model is generated, the encoder subtracts the approximation from the original signal to get the residual (error) signal. The error signal is then losslessly coded. To do this, FLAC takes advantage of the fact that the error signal generally has a Laplacian (two-sided geometric) distribution, and that there is a set of special Huffman codes called Rice codes that can be used to efficiently encode this kind of signal quickly and without needing a dictionary.

Rice coding involves finding a single parameter that matches a signal's distribution, then using that parameter to generate the codes. As the distribution changes, the optimal parameter changes, so FLAC supports a method that allows the parameter to change as needed. The residual can be broken into several contexts or partitions, each with its own Rice parameter. flac allows you to specify how the partitioning is done with the -r option. The residual can be broken into 2^n partitions, by using the option -r n,n. The parameter n is called the partition order. Furthermore, the encoder can be made to search through m to n partition orders, taking the best one, by specifying -r m,n. Generally, the choice of n does not affect encoding speed but m,n does. The larger the difference between m and n, the more time it will take the encoder to search for the best order. The block size will also affect the optimal order.
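
To show what a Rice code looks like, here is a sketch that encodes one signed residual with parameter k and prints the result as a string of 0s and 1s. Two conventions in it are assumptions made for the illustration: signed values are first folded to non-negative ones (0, -1, 1, -2, 2 become 0, 1, 2, 3, 4), and the quotient is written as a run of zeros ended by a one. The exact bit layout FLAC uses is defined in the format specification.

```shell
# usage: rice_encode VALUE K   -> prints the code as a bit string
rice_encode() {
    r=$1
    k=$2
    # Fold the sign into the lowest bit so small magnitudes stay small.
    if [ "$r" -ge 0 ]; then
        u=$(( 2 * r ))
    else
        u=$(( -2 * r - 1 ))
    fi
    # High part: the quotient u >> k, in unary (q zeros, then a one).
    q=$(( u >> k ))
    bits=""
    while [ "$q" -gt 0 ]; do
        bits="${bits}0"
        q=$(( q - 1 ))
    done
    bits="${bits}1"
    # Low part: the remaining k bits, written verbatim, most significant first.
    i=$(( k - 1 ))
    while [ "$i" -ge 0 ]; do
        bits="${bits}$(( (u >> i) & 1 ))"
        i=$(( i - 1 ))
    done
    printf '%s\n' "$bits"
}
```

With k = 2, the residual 0 becomes 100 and -1 becomes 101 (three bits each), while 5 becomes 00110. A larger k shortens the unary part for big values but adds verbatim bits to every value, which is why the best parameter depends on the distribution of the residual and why allowing a separate parameter per partition helps.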

FRAMING

An audio frame is preceded by a frame header and trailed by a frame footer. The header starts with a sync code, and contains the minimum information necessary for a decoder to play the stream, like sample rate, bits per sample, etc. It also contains the block or sample number and an 8-bit CRC of the frame header. The sync code, frame header CRC, and block/sample number allow resynchronization and seeking even in the absence of seek points. The frame footer contains a 16-bit CRC of the entire encoded frame for error detection. If the reference decoder detects a CRC error it will generate a silent block.
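
For a feel of what the 8-bit header CRC involves, here is a bitwise sketch. It assumes the polynomial x^8 + x^2 + x + 1 (0x07) with an initial value of zero, which is what the FLAC format specification gives for the frame header CRC; the overview above does not state it. The function name and the convention of passing bytes as numeric arguments are just for the example.

```shell
# usage: crc8 BYTE BYTE ...   (byte values as decimal or 0x.. hex numbers)
# Prints the CRC-8 of the given bytes as a decimal number.
crc8() {
    crc=0
    for byte in "$@"; do
        crc=$(( crc ^ $byte ))
        i=0
        while [ "$i" -lt 8 ]; do
            if [ $(( crc & 0x80 )) -ne 0 ]; then
                # Top bit set: shift it out and fold in the polynomial.
                crc=$(( ((crc << 1) ^ 0x07) & 0xFF ))
            else
                crc=$(( (crc << 1) & 0xFF ))
            fi
            i=$(( i + 1 ))
        done
    done
    printf '%d\n' "$crc"
}
```

A decoder hunting for a frame boundary can compute this over the bytes of a candidate header and compare it with the stored CRC byte; a mismatch means the sync code was a false positive, which is what makes resynchronization in the middle of a stream practical.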

MISCELLANEOUS

As a convenience, the reference decoder knows how to skip ID3v1 and ID3v2 tags. Note however that the FLAC specification does not require compliant implementations to support ID3 in any form and their use is strongly discouraged.

flac has a verify option -V that verifies the output while encoding. With this option, a decoder is run in parallel to the encoder and its output is compared against the original input. If a difference is found flac will stop with an error.
https://xiph.org/flac/documentation_..._overview.html

Firefox Addon Makes Pirating Music Easier
Nick Caiello

The Amazon MP3 Store may have the lowest prices on DRM free music, but for some people 79 cents for a song is just too much, especially when [john] and the folks at pirates-of-the-amazon.com can help you get that song for free. Pirates of the Amazon is a slick Firefox addon that inserts a “download 4 free” button next to the “add to cart” button in the Amazon MP3 Store. After clicking on the button, the addon refers users to a thepiratebay.org search page with bittorrent download links for the song or album. While there is no question that this makes getting your music easier, by using this addon you do run the risk of violating copyright laws, depending on which country you live in.

There isn’t much here that hasn’t been thrown into Greasemonkey scripts in the past and we wonder if they’re marketing this to anyone at all. People who absolutely love using Amazon but hate buying stuff perhaps? They cite a couple interesting projects in their about section: Amazon Noir robotically abused the “Search Inside” feature to reconstruct entire books. OU Library searches your local library to see if it has the Amazon book you’re looking for.
https://hackaday.com/2008/12/03/fire...rating-easier/

Until next week,

- js.

Current Week In Review

Recent WiRs -

September 3rd, August 27th, August 20th, August 13th

Jack Spratts' Week In Review is published every Friday. Submit letters, articles, press releases, comments, questions etc. in plain text English to jackspratts (at) lycos (dot) com. Submission deadlines are Thursdays @ 1400 UTC. Please include contact info. The right to publish all remarks is reserved.

"The First Amendment rests on the assumption that the widest possible dissemination of information from diverse and antagonistic sources is essential to the welfare of the public." - Hugo Black

09-09-22, 06:20 AM	#1
JackSpratts Join Date: May 2001 Location: New England Posts: 10,015	Peer-To-Peer News - The Week In Review - September 10th, 22 Since 2002 September 10th, 2022 Dead USB Drives Are Fine: Building a Reliable Sneakernet John Goerzen “OK,” you’re probably thinking. “John, you talk a lot about things like Gopher and personal radios, and now you want to talk about building a reliable network out of… USB drives?” Well, yes. In fact, I’ve already done it. What is sneakernet? Normally, “sneakernet” is a sort of tongue-in-cheek reference to using disconnected storage to transport data or messages. By “disconnect storage” I mean anything like CD-ROMs, hard drives, SD cards, USB drives, and so forth. There are times when loading up 12TB on a device and driving it across town is just faster and easier than using the Internet for the same. And, sometimes you need to get data to places that have no Internet at all. Another reason for sneakernet is security. For instance, if your backup system is online, and your systems being backed up are online, then it could become possible for an attacker to destroy both your primary copy of data and your backups. Or, you might use a dedicated computer with no network connection to do GnuPG (GPG) signing. What about “reliable” sneakernet, then? TCP is often considered a “reliable” protocol. That means that the sending side is generally able to tell if its message was properly received. As with most reliable protocols, we have these components: 1. After transmitting a piece of data, the sender retains it. 2. After receiving a piece of data, the receiver sends an acknowledgment (ACK) back to the sender. 3. Upon receiving the acknowledgment, the sender removes its buffered copy of the data. 4. If no acknowledgment is received at the sender, it retransmits the data, in case it gets lost in transit. 5. It reorders any packets that arrive out of order, so that the recipient’s data stream is ordered correctly. 
Now, a lot of the things I just mentioned for sneakernet are legendarily unreliable. USB drives fail, CD-ROMs get scratched, hard drives get banged up. Think about putting these things in a bicycle bag or airline luggage. Some of them are going to fail. You might think, “well, I’ll just copy files to a USB drive instead of move them, and once I get them onto the destination machine, I’ll delete them from the source.” Congratulations! You are a human retransmit algorithm! We should be able to automate this! And we can. Enter NNCP NNCP is one of those things that almost defies explanation. It is a toolkit for building asynchronous networks. It can use as a carrier: a pipe, TCP network connection, a mounted filesystem (specifically intended for cases like this), and much more. It also supports multi-hop asynchronous routing and asynchronous meshing, but these are beyond the scope of this particular article. NNCP’s transports that involve live communication between two hops already had all the hallmarks of being reliable; there was a positive ACK and retransmit. As of version 8.7.0, NNCP’s ACKs themselves can also be asynchronous – meaning that every NNCP transport can now be reliable. Yes, that’s right. Your ACKs can flow over tapes and USB drives if you want them to. I use this for archiving and backups. If you aren’t already familiar with NNCP, you might take a look at my NNCP page. I also have a lot of blog posts about NNCP. Those pages describe the basics of NNCP: the “packet” (the unit of transmission in NNCP, which can be tiny or many TB), the end-to-end encryption, and so forth. The new command we will now be interested in is nncp-ack. The Basic Idea Here are the basic steps to processing this stuff with NNCP: 1. First, we use nncp-xfer -rx to process incoming packets from the USB (or other media) device. This moves them into the NNCP inbound queue, deleting them from the media device, and verifies the packet integrity. 2. 
We use nncp-ack -node $NODE to create ACK packets responding to the packets we just loaded into the rx queue. It writes a list of generated ACKs onto fd 4, which we save off for later use. 3. We run nncp-toss -seen to process the incoming queue. The use of -seen causes NNCP to remember the hashes of packets seen before, so a duplicate of an already-seen packet will not be processed twice. This command also processes incoming ACKs for packets we’ve sent out previously; if they pass verification, the relevant packets are removed from the local machine’s tx queue. 4. Now, we use nncp-xfer -keep -tx -mkdir -node $NODE to send outgoing packets to a given node by writing them to a given directory on the media device. -keep causes them to remain in the outgoing queue. 5. Finally, we use the list of generated ACK packets saved off in step 2 above. That list is passed to nncp-rm -node $NODE -pkt < $FILE to remove those specific packets from the outbound queue. The reason is that there will never be an ACK of ACK packet (that would create an infinite loop), so if we don’t delete them in this manner, they would hang around forever. You can see these steps follow the same basic outline on upstream’s nncp-ack page. One thing to keep in mind: if anything else is running nncp-toss, there is a chance of a race condition between steps 1 and 2 (if nncp-toss gets to it first, it might not get an ack generated). This would sort itself out eventually, presumably, as the sender would retransmit and it would be ACKed later. Further ideas NNCP guarantees the integrity of packets, but not ordering between packets; if you need that, you might look into my Filespooler program. It is designed to work with NNCP and can provide ordered processing. An example script Here is a script you might try for this sort of thing. It may have more logic than you need – really, you just need the steps above – but hopefully it is clear. 
Code: #!/bin/bash set -eo pipefail MEDIABASE="/media/$USER" # The local node name NODENAME="`hostname`" # All nodes. NODENAME should be in this list. ALLNODES="node1 node2 node3" RUNNNCP="" # If you need to sudo, use something like RUNNNCP="sudo -Hu nncp" NNCPPATH="/usr/local/nncp/bin" ACKPATH="`mktemp -d`" # Process incoming packets. # # Parameters: $1 - the path to scan. Must contain a directory # named "nncp". procrxpath () { while [ -n "$1" ]; do BASEPATH="$1/nncp" shift if ! [ -d "$BASEPATH" ]; then echo "$BASEPATH doesn't exist; skipping" continue fi echo " * Incoming: processing $BASEPATH" TMPDIR="`mktemp -d`" # This rsync and the one below can help with # certain permission issues from weird foreign # media. You could just eliminate it and # always use $BASEPATH instead of $TMPDIR below. rsync -rt "$BASEPATH/" "$TMPDIR/" # You may need these next two lines if using sudo as above. # chgrp -R nncp "$TMPDIR" # chmod -R g+rwX "$TMPDIR" echo " Running nncp-xfer -rx" $RUNNNCP $NNCPPATH/nncp-xfer -progress -rx "$TMPDIR" for NODE in $ALLNODES; do if [ "$NODE" != "$NODENAME" ]; then echo " Running nncp-ack for $NODE" # Now, we generate ACK packets for each node we will # process. nncp-ack writes a list of the created # ACK packets to fd 4. We'll use them later. # If using sudo, add -C 5 after $RUNNNCP. $RUNNNCP $NNCPPATH/nncp-ack -progress -node "$NODE" \ 4>> "$ACKPATH/$NODE" fi done rsync --delete -rt "$TMPDIR/" "$BASEPATH/" rm -fr "$TMPDIR" done } proctxpath () { while [ -n "$1" ]; do BASEPATH="$1/nncp" shift if ! [ -d "$BASEPATH" ]; then echo "$BASEPATH doesn't exist; skipping" continue fi echo " * Outgoing: processing $BASEPATH" TMPDIR="`mktemp -d`" rsync -rt "$BASEPATH/" "$TMPDIR/" # You may need these two lines if using sudo: # chgrp -R nncp "$TMPDIR" # chmod -R g+rwX "$TMPDIR" for DESTHOST in $ALLNODES; do if [ "$DESTHOST" = "$NODENAME" ]; then continue fi # Copy outgoing packets to this node, but keep them in the outgoing # queue with -keep. 
$RUNNNCP $NNCPPATH/nncp-xfer -keep -tx -mkdir -node "$DESTHOST" -progress "$TMPDIR" # Here is the key: that list of ACK packets we made above - now we delete them. # There will never be an ACK for an ACK, so they'd keep sending forever # if we didn't do this. if [ -f "$ACKPATH/$DESTHOST" ]; then echo "nncp-rm for node $DESTHOST" $RUNNNCP $NNCPPATH/nncp-rm -debug -node "$DESTHOST" -pkt < "$ACKPATH/$DESTHOST" fi done rsync --delete -rt "$TMPDIR/" "$BASEPATH/" rm -rf "$TMPDIR" # We only want to write stuff once. return 0 done } procrxpath "$MEDIABASE"/* echo " *** Initial tossing..." # We make sure to use -seen to rule out duplicates. $RUNNNCP $NNCPPATH/nncp-toss -progress -seen proctxpath "$MEDIABASE"/* echo "You can unmount devices now." echo "Done." https://changelog.complete.org/archi...ble-sneakernet Free, Lossless… FLAC Josh Coalson Format Overview The basic structure of a FLAC stream is: • The four byte string "fLaC" • The STREAMINFO metadata block • Zero or more other metadata blocks • One or more audio frames The first four bytes are to identify the FLAC stream. The metadata that follows contains all the information about the stream except for the audio data itself. After the metadata comes the encoded audio data. METADATA FLAC defines several types of metadata blocks (see the format page for the complete list). Metadata blocks can be any length and new ones can be defined. A decoder is allowed to skip any metadata types it does not understand. Only one is mandatory: the STREAMINFO block. This block has information like the sample rate, number of channels, etc., and data that can help the decoder manage its buffers, like the minimum and maximum data rate and minimum and maximum block size. Also included in the STREAMINFO block is the MD5 signature of the unencoded audio data. This is useful for checking an entire stream for transmission errors. Other blocks allow for padding, seek tables, tags, cuesheets, and application-specific data. 
There are flac options for adding PADDING blocks or specifying seek points. FLAC does not require seek points for seeking but they can speed up seeks, or be used for cueing in editing applications. Also, if you have a need of a custom metadata block, you can define your own and request an ID here. Then you can reserve a PADDING block of the correct size when encoding, and overwrite the padding block with your APPLICATION block after encoding. The resulting stream will be FLAC compatible; decoders that are aware of your metadata can use it and the rest will safely ignore it. AUDIO DATA After the metadata comes the encoded audio data. Audio data and metadata are not interleaved. Like most audio codecs, FLAC splits the unencoded audio data into blocks, and encodes each block separately. The encoded block is packed into a frame and appended to the stream. The reference encoder uses a single block size for the whole stream but the FLAC format does not require it. BLOCKING The block size is an important parameter to encoding. If it is too small, the frame overhead will lower the compression. If it is too large, the modeling stage of the compressor will not be able to generate an efficient model. Understanding FLAC's modeling will help you to improve compression for some kinds of input by varying the block size. In the most general case, using linear prediction on 44.1kHz audio, the optimal block size will be between 2-6 ksamples. flac defaults to a block size of 4096 in this case. Using the fast fixed predictors, a smaller block size is usually preferable because of the smaller frame header. INTER-CHANNEL DECORRELATION In the case of stereo input, once the data is blocked it is optionally passed through an inter-channel decorrelation stage. The left and right channels are converted to center and side channels through the following transformation: mid = (left + right) / 2, side = left - right. This is a lossless process, unlike joint stereo. 
For normal CD audio this can result in significant extra compression. flac has two options for this: -m always compresses both the left-right and mid-side versions of the block and takes the smallest frame, and -M, which adaptively switches between left-right and mid-side. MODELING In the next stage, the encoder tries to approximate the signal with a function in such a way that when the approximation is subracted, the result (called the residual, residue, or error) requires fewer bits-per-sample to encode. The function's parameters also have to be transmitted so they should not be so complex as to eat up the savings. FLAC has two methods of forming approximations: 1) fitting a simple polynomial to the signal; and 2) general linear predictive coding (LPC). I will not go into the details here, only some generalities that involve the encoding options. First, fixed polynomial prediction (specified with -l 0) is much faster, but less accurate than LPC. The higher the maximum LPC order, the slower, but more accurate, the model will be. However, there are diminishing returns with increasing orders. Also, at some point (usually around order 9) the part of the encoder that guesses what is the best order to use will start to get it wrong and the compression will actually decrease slightly; at that point you will have to you will have to use the exhaustive search option -e to overcome this, which is significantly slower. Second, the parameters for the fixed predictors can be transmitted in 3 bits whereas the parameters for the LPC model depend on the bits-per-sample and LPC order. This means the frame header length varies depending on the method and order you choose and can affect the optimal block size. RESIDUAL CODING Once the model is generated, the encoder subracts the approximation from the original signal to get the residual (error) signal. The error signal is then losslessly coded. 
To code the residual, FLAC takes advantage of the fact that the error signal generally has a Laplacian (two-sided geometric) distribution, and that there is a set of special Huffman codes called Rice codes that can be used to efficiently encode this kind of signal quickly and without needing a dictionary.

Rice coding involves finding a single parameter that matches a signal's distribution, then using that parameter to generate the codes. As the distribution changes, the optimal parameter changes, so FLAC supports a method that allows the parameter to change as needed. The residual can be broken into several contexts or partitions, each with its own Rice parameter. flac allows you to specify how the partitioning is done with the -r option. The residual can be broken into 2^n partitions by using the option -r n,n. The parameter n is called the partition order. Furthermore, the encoder can be made to search through partition orders m to n, taking the best one, by specifying -r m,n. Generally, the choice of a single order n does not affect encoding speed, but a search range m,n does. The larger the difference between m and n, the more time it will take the encoder to search for the best order. The block size will also affect the optimal order.

FRAMING

An audio frame is preceded by a frame header and trailed by a frame footer. The header starts with a sync code, and contains the minimum information necessary for a decoder to play the stream, like sample rate, bits per sample, etc. It also contains the block or sample number and an 8-bit CRC of the frame header. The sync code, frame header CRC, and block/sample number allow resynchronization and seeking even in the absence of seek points. The frame footer contains a 16-bit CRC of the entire encoded frame for error detection. If the reference decoder detects a CRC error it will generate a silent block.

MISCELLANEOUS

As a convenience, the reference decoder knows how to skip ID3v1 and ID3v2 tags.
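
Returning briefly to the RESIDUAL CODING section above, the Rice coding it describes can be sketched in a few lines. This is an illustration only, not a bit-exact rendering of the FLAC bitstream: the signed-to-unsigned ("zigzag") mapping, the unary convention (zeros terminated by a one), and every function name here are assumptions. Bits are kept as a text string for readability.

```python
def zigzag(value):
    """Map a signed residual to a non-negative integer: 0,-1,1,-2 -> 0,1,2,3."""
    return 2 * value if value >= 0 else -2 * value - 1


def unzigzag(u):
    """Invert zigzag."""
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)


def rice_encode(values, k):
    """Encode signed integers with Rice parameter k; returns a '0'/'1' string."""
    pieces = []
    for v in values:
        u = zigzag(v)
        quotient, remainder = u >> k, u & ((1 << k) - 1)
        pieces.append("0" * quotient + "1")          # quotient in unary
        if k:
            pieces.append(format(remainder, f"0{k}b"))  # remainder in k bits
    return "".join(pieces)


def rice_decode(bits, k, count):
    """Decode `count` values from a bit string produced by rice_encode."""
    out, pos = [], 0
    for _ in range(count):
        quotient = 0
        while bits[pos] == "0":
            quotient += 1
            pos += 1
        pos += 1                                     # skip the terminating '1'
        remainder = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        out.append(unzigzag((quotient << k) | remainder))
    return out


def best_parameter(values, max_k=14):
    """Brute-force search for the parameter giving the shortest encoding."""
    return min(range(max_k + 1), key=lambda k: len(rice_encode(values, k)))


if __name__ == "__main__":
    print(rice_encode([-1], 1))   # 11
    print(rice_encode([3], 1))    # 00010
```

Calling best_parameter separately on each slice of a residual mirrors the idea of partitions that each carry their own Rice parameter: a quiet stretch and a loud stretch of the same block usually want different values of k.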
Note, however, that the FLAC specification does not require compliant implementations to support ID3 in any form, and their use is strongly discouraged.

flac has a verify option -V that verifies the output while encoding. With this option, a decoder is run in parallel to the encoder and its output is compared against the original input. If a difference is found flac will stop with an error.

https://xiph.org/flac/documentation_..._overview.html

Firefox Addon Makes Pirating Music Easier
Nick Caiello

The Amazon MP3 Store may have the lowest prices on DRM-free music, but for some people 79 cents for a song is just too much, especially when [john] and the folks at pirates-of-the-amazon.com can help you get that song for free. Pirates of the Amazon is a slick Firefox addon that inserts a “download 4 free” button next to the “add to cart” button in the Amazon MP3 Store. After clicking on the button, the addon refers users to a thepiratebay.org search page with bittorrent download links for the song or album. While there is no question that this makes getting your music easier, by using this addon you do run the risk of violating copyright laws, depending on which country you live in.

There isn’t much here that hasn’t been thrown into Greasemonkey scripts in the past, and we wonder if they’re marketing this to anyone at all. People who absolutely love using Amazon but hate buying stuff, perhaps? They cite a couple of interesting projects in their about section: Amazon Noir robotically abused the “Search Inside” feature to reconstruct entire books. OU Library searches your local library to see if it has the Amazon book you’re looking for.

https://hackaday.com/2008/12/03/fire...rating-easier/

Until next week,

- js.

Current Week In Review

Recent WiRs - September 3rd, August 27th, August 20th, August 13th

Jack Spratts' Week In Review is published every Friday. Submit letters, articles, press releases, comments, questions etc. in plain text English to jackspratts (at) lycos (dot) com.
Submission deadlines are Thursdays @ 1400 UTC. Please include contact info. The right to publish all remarks is reserved.

"The First Amendment rests on the assumption that the widest possible dissemination of information from diverse and antagonistic sources is essential to the welfare of the public."
- Hugo Black

__________________
Thanks For Sharing
