Edited Chapter 2

Kenneth John Odle 2021-11-06 09:27:20 -04:00
parent e0c2199aea
commit 479a14079f
3 changed files with 22 additions and 16 deletions


@@ -106,27 +106,29 @@ Boring, early life stuff when my world smelled like sweat and disinfectant and r
\chapter{A Scanner Clearly, or More Thoughts on Being an Archivist} \chapter{A Scanner Clearly, or More Thoughts on Being an Archivist}
In the first issue of this zine, I wrote about a basic workflow for archiving books through scanning them into pdf files. While I covered about everything I wanted to cover on the computer end of things, I barely talked at all about the physical labor that goes into scanning this book. For those who are interested, here's what happens behind the scenes before you even get to the computer. In the first issue of this zine, I wrote about a basic workflow for archiving books through scanning them into pdf files. While I covered about everything I wanted to cover on the computer end of things, I barely talked at all about the physical labor that goes into scanning a book. For those who are interested, here's what happens behind the scenes before you even get to the computer.
\section{The Non-Computer Stuff} \section{The Non-Computer Stuff}
First, you have to cut the book apart, and then separate the pages from the bindings. Older books are generally signature bound, and so it's simply a matter of cutting the backing off the signatures, cutting any strings holding them together, and then separating the signatures. This sounds easy, but it's a lot of work. If the book is perfect bound (i.e., individual pages are glued together, rather than signatures), it's just a matter of separating the pages very carefully a few pages at a time. First, you have to cut the book apart, and then separate the pages from the bindings. Older books are generally signature bound, and so it's simply a matter of cutting the backing off the signatures, cutting any strings holding them together, and then separating the signatures. This sounds easy, but it's a lot of work. If the book is perfect bound (i.e., individual pages are glued together, rather than signatures), it's just a matter of separating the pages very carefully a few pages at a time.
Second, you then have to trim the bound edges, so that the pages are separate. Perfectly bound books tend to have glue creeping up between each page, whereas signature bound books tend to have glue only creeping up between the signatures. I have paper trimmer that allows me to clamp the pages down so that they don't move as I cut them, and I highly recommend this. It also has a measured grid to the left, so that I can ensure I'm cutting all the pages to the same width. (\textbf{Protip:} Put a piece of painter's tape on the grid to make this even easier.) Second, you then have to trim the bound edges, so that the pages are separate. Perfectly bound books tend to have glue creeping up between each page, whereas signature bound books tend to have glue only creeping up between the signatures. I have a paper trimmer that allows me to clamp the pages down so that they don't move as I cut them, and I highly recommend something similar. It also has a measured grid to the left, so that I can ensure I'm cutting all the pages to the same width. (\textbf{Protip:} Put a piece of painter's tape on the grid to make this even easier.)
After that, separate your pages into groups that you will scan. This should be however many sheets your scanner can handle easily at one time, and will depend largely on the kind of paper the book was printed on. I generally find ten sheets work well, and make it easier for me to count. Smaller groups means more work up front, but it also means that it is easier to fix things when (not \textit{if}) something goes wrong. \begin{center}
\includegraphics[scale=0.5]{paper_cutter}
\end{center}
\begin{wrapfigure}[]{R}{0.6\textwidth} After that, separate your pages into groups of equal numbers of pages that you will scan. This should be however many sheets your scanner can handle easily at one time, and will depend largely on the kind of paper the book was printed on. I generally find ten sheets (i.e., 20 pages) work well, and make it easier for me to count. Smaller groups means more work up front, but it also means that it is easier to fix things when (not \textit{if}) something goes wrong.
% \raisebox{0pt}[\dimexpr\height-1\baselineskip\relax]{\includegraphics[scale=0.4]{number_your_sections}}
\vspace{-\intextsep} \begin{center}
\includegraphics[scale=0.45]{number_your_sections} \includegraphics[scale=0.5]{number_your_sections}
\end{wrapfigure} \end{center}
Number all of your groups with the filename they will eventually have. I use a pencil and mark this lightly (or not so lightly, depending on the day) in the lower right corner of the first page: Number all of your groups with the filename they will eventually have. I use a pencil and mark this lightly (or not so lightly, depending on the day) in the lower right corner of the first page:
This is all about workflow for me. Since my scanner (a Brother MFC-J8050DW) scans whatever is facing \textit{down} in the document feeder, after I scan the first side, I should see odd numbers facing up in the ADF. I then know that I need to scan the side that is now facing down, which means that I don't turn them over, I just rotate them 180\textdegree{} in the \textit{xy}-plane. This is all about workflow for me. Since my scanner (a Brother MFC-J8050DW) scans whatever is facing \textit{down} in the document feeder, after I scan the first (i.e., odd-numbered) side, I should see odd numbers facing up in the ADF. I then know that I need to scan the side that is now facing down, which means that I don't turn them over, I just rotate them 180\textdegree{} in the \textit{xy}-plane.
Most books have unnumbered pages. This should go without saying, but it's one of those things that you don't think about until after it becomes an issue: \textit{number all the blank pages:} Most books have unnumbered pages. This should go without saying, but it's one of those things that you don't think about until after it becomes an issue: \textit{number all the blank pages}. Again, I just use a pencil:
\begin{center} \begin{center}
\includegraphics[scale=0.5]{number_blank_pages} \includegraphics[scale=0.5]{number_blank_pages}
@@ -135,7 +137,7 @@ Most books have unnumbered pages. This should go without saying, but it's one of
Once you do all of this, you're ready to scan. You should have a pile of stuff that looks something like this: Once you do all of this, you're ready to scan. You should have a pile of stuff that looks something like this:
\begin{center} \begin{center}
\includegraphics[scale=0.4]{ready_to_scan} \includegraphics[scale=0.5]{ready_to_scan}
\end{center} \end{center}
As it turns out, if you have a couple of pages that bleed to the middle\footnote{Generally because of a photograph or illustration that continues across both the left and right pages.}, you can pretend that they don't and simply trim those pages the same size as all your other pages. Or, if you want to preserve that bleed, you'll have to remove those sheets \textit{before} you trim the edges, and separate them very carefully down the middle. If you are lucky, they are in the middle of a signature, and you can separate them with a sharp knife. If you are not lucky, they will be somewhere else, and may have glue holding them together, meaning you have to very carefully prise them apart. No matter how carefully you do this, you will inevitably lose some data. As it turns out, if you have a couple of pages that bleed to the middle\footnote{Generally because of a photograph or illustration that continues across both the left and right pages.}, you can pretend that they don't and simply trim those pages the same size as all your other pages. Or, if you want to preserve that bleed, you'll have to remove those sheets \textit{before} you trim the edges, and separate them very carefully down the middle. If you are lucky, they are in the middle of a signature, and you can separate them with a sharp knife. If you are not lucky, they will be somewhere else, and may have glue holding them together, meaning you have to very carefully prise them apart. No matter how carefully you do this, you will inevitably lose some data.
@@ -143,18 +145,20 @@ As it turns out, if you have a couple of pages that bleed to the middle\footnote
You will end up with two sheets (i.e., four pages) that are a different size, and which will need to be scanned separately. In which case, it's good to use a cheat sheet to keep track of which groups are which sizes and how many pages are contained in each. I like to just jot this down on an index card, but if the book you are scanning is complex, you'll need a bigger sheet of paper. You will end up with two sheets (i.e., four pages) that are a different size, and which will need to be scanned separately. In which case, it's good to use a cheat sheet to keep track of which groups are which sizes and how many pages are contained in each. I like to just jot this down on an index card, but if the book you are scanning is complex, you'll need a bigger sheet of paper.
\begin{center} \begin{center}
\includegraphics[scale=0.4]{cheat_sheet} \includegraphics[scale=0.5]{cheat_sheet}
\end{center} \end{center}
Now we are \textit{finally} ready to start scanning.
\section{What Does This Have to do With Linux?} \section{What Does This Have to do With Linux?}
You may be wondering why I am spending so much time talking about using scissors and pencils and rulers when this is a zine about Linux. Sure, this is what I need to do to get ready to scan and put all those scans together using the command line application \texttt{pdftk}, but what is the point here? You may be wondering why I am spending so much time talking about using scissors and pencils and rulers when this is a zine about Linux. Sure, this is what I need to do to get ready to scan and put all those scans together using the command line application \texttt{pdftk}, but what is the point here?
If you recall back in the first issue, I said that doing things on the command line makes you think about \textit{outcomes}. Thinking about outcomes doesn't matter just on the computer. Like I said earlier, there is no ``undo'' button in real life.\footnote{Although I sometimes think that lawyers are just rich people's undo buttons.} Once you've cut something apart, there's no putting it back together. You \textit{have} to think about what you want the next step of the process to be so that you don't do something in this step that makes the next step difficult or even impossible. You have to think ahead about what you want to end up with. If you recall back in the first issue, I said that doing things on the command line makes you think about \textit{outcomes}. Thinking about outcomes doesn't matter just on the computer. Like I said earlier, there is no ``undo'' button in real life.\footnote{Although I sometimes think that lawyers are just rich people's undo buttons.} Once you've cut something apart, there's no putting it back together. You \textit{have} to think about what you want the next step of the process to be so that you don't do something in this step that makes the next step afterward difficult or even impossible. You have to think ahead about what you want to end up with. You have to know what you want.\footnote{And here I readily admit that you are probably going to do this a few times before you figure out a process for getting to that point. The workflow I laid out in section 2.1 took a few books to develop.}
I sometimes think that a GUI is like the menu at the McDonald's drive through.\footnote{It's no wonder that the menu in a GUI is called a menu, when you think about it.} If you are lucky, you are behind the person that knows what they want. They planned ahead. They thought about the outcome they wanted (full stomach, happy taste buds) and chose something ahead of time that would get them to that outcome. But a lot of people (too many people, in my opinion\footnote{Just one of many reasons I don't eat fast food any more.} get up to that order screen and \textit{that's} when they decide to start thinking about outcomes. They are so used to seeing a menu in front of them that they can't even begin making a decision without seeing it. I sometimes think that a GUI is like the menu at the McDonald's drive through.\footnote{It's no wonder that the menu in a GUI is called a \textit{menu}, when you think about it.} If you are lucky, you are behind the person that knows what they want. They planned ahead. They thought about the outcome they wanted (full stomach, happy taste buds) and chose something ahead of time that would get them to that outcome. But a lot of people (too many people, in my opinion\footnote{Just one of many reasons I don't eat fast food any more.} get up to that order screen and \textit{that's} when they decide to start thinking about outcomes. They are so used to seeing a menu in front of them that they can't even begin making a decision without seeing it.
And let's face it: the menu at McDonald's has not really changed in years. Yes, they have new things, but they also advertise the hell out of them when they do. How can you \textit{not} know about their new menu item if you watch more than 30 minutes of television a day? But again, a GUI does not encourage you to think. The command line does. And let's face it: most people just don't like to think. They like the \textit{illusion} of choice, and that is what substitutes for thinking most of the time. ``What are you getting at McDonald's?'' is too often followed by ``I don't know; I'll think about it when we get there.'' And let's face it: the menu at McDonald's has not really changed in years. Yes, they have new things, but they also advertise the hell out of them when they do. How can you \textit{not} know about their new menu item if you watch more than 30 minutes of television a day? But again, a GUI does not encourage you to think. The command line does. And again: most people just don't like to think.\footnote{I admit, I like to think, but I don't like to think \textit{all the time}. Sometimes my brain needs a rest, so I make some popcorn and pull up something corny to watch on television for a while. But then my brain starts wandering, and I know it's time to get back to work.} They like the \textit{illusion} of choice, and that is what substitutes for thinking most of the time. ``What are you getting at McDonald's?'' is too often followed by ``I don't know; I'll think about it when we get there.''
Of course, if you are thinking about outcomes, chances are you don't eat fast food very often anyway, because the long term outcomes are obesity, heart disease, and hypertension. But damn, those fries are good! Of course, if you are thinking about outcomes, chances are you don't eat fast food very often anyway, because the long term outcomes are obesity, heart disease, and hypertension. But damn, those fries are good!
@@ -162,7 +166,9 @@ Of course, if you are thinking about outcomes, chances are you don't eat fast fo
For what it's worth, there is a GUI for \texttt{pdftk}. It's called PDF Chain and you can find it at \href{https://pdfchain.sourceforge.io/}{\texttt{https://pdfchain.sourceforge.io/}}. For what it's worth, there is a GUI for \texttt{pdftk}. It's called PDF Chain and you can find it at \href{https://pdfchain.sourceforge.io/}{\texttt{https://pdfchain.sourceforge.io/}}.
Despite all my prattling on about the many advantages the command line has for your brain, I'm not opposed to using a GUI, actually. (I mean, I have Ubuntu installed on two machines and Kubuntu on a third.) A GUI does make life easier in many ways, and what I like about one in a case like this is that if you're someone who has to manipulate pdf files rarely or only once, it's probably easier to just use a GUI than it is to learn the command line. Efficiency plays a role here, as well. If I'm going to use this all the time, it's definitely more efficient for me to learn the command line approach. But once or twice a year? Or only once? A GUI is much more efficient. Despite all my prattling on about the many advantages the command line has for your brain, I'm not opposed to using a GUI, actually. (I mean, I have Ubuntu installed on two machines and Kubuntu on a third.) A GUI does make life easier in many ways, and what I like about one in a case like this is that if you're someone who has to manipulate pdf files rarely or only once, it's probably easier to just use a GUI than it is to learn the command line. Efficiency plays a role here, as well. If I'm going to use this all the time, it's definitely more efficient for me to learn the command line approach. But once or twice a year? Or only once ever? A GUI is much more efficient.
\textbf{tl;dr:} If you're only going to use a tool once, there's no issue with using the simplest tool required to get the job done.
\begin{center} \begin{center}
\includegraphics[scale=0.4]{pdfchain_-_title} \includegraphics[scale=0.4]{pdfchain_-_title}

BIN
002/images/paper_cutter.jpg Normal file

Binary file not shown.
