Digitizing your papers, literally, for the future, with 4K video


I have so much paper that I've been on a slow quest to scan it. I have high-speed scanners and other tools, but it remains a great deal of work, especially to do it reliably enough that you would throw away the scanned papers. I have done around 10 posts on digitizing and gathered them under that tag.

Recently, a friend asked me what to do with the papers of a deceased parent. Scanning them on your own or in scanning shops is time consuming and expensive, so a new thought came to me.

Set up a scanning table by mounting a camera that shoots 4K video looking down on the table. I have tripods that have an arm that extends out but there are many ways to mount it. Light the table brightly, and bring your papers. Then start the 4K video and start slapping the pages down (or pulling them off) as fast as you can.

There is no software today that can turn that video into a well scanned document. But there will be. Truth is, we could write it today, but nobody has. If you scan this way, you're making the bet that somebody will. Even if nobody does, you can still go into the video, find any page and pull it out by hand. It will just be a lot of work, so you would only do this for single pages, not for whole documents. You are literally saving the document "for the future" because you are depending on future technology to easily extract it.

If you want to do this, here are some tricks to make it more likely to work:

  • Use really bright light, as bright as you can. This means the exposures will be shorter (eliminating the blur from the fast movement of the pages) and the depth of field will be deeper, making sure all stays in focus. Make the light so bright you gotta' wear shades, or even do it in sunlight.
  • At the same time, you want even light (multiple sources) and if you are scanning glossy pages you want to avoid glare.
  • Consider setting manual focus, as long as you have good depth of field. That way your camera won't start doing a focus hunt, which could leave some of your pages out of focus when shot. Unless your lens holds focus while zooming (a parfocal lens), remember to re-focus when you change the zoom for different sized documents.
  • Consider a custom or fixed white balance, though that doesn't matter for text so much. Also, the future software can fix a wrong white balance.
  • Consider manual exposure, set to not quite overexpose white paper on your table. (If using sunlight, you may need to adjust from time to time.)
  • Do experiments at different exposures, F-stops and ISOs to find the combination that keeps pages sharp and not grainy. Sharp is probably more important, so you might end up bumping ISO -- but it's better to just add more light.
  • On the table, put a sheet of some unusual colour. Real movie greenscreen is cheap online. This will help the future software remove the background.
  • Get a remote control or cable release to make it easy to start and stop video recording.
  • You can either pile the pages on top of one another until the pile gets thick, or place a pile on the table and remove them one by one. Not sure which will do better.
  • Alternately, have two people, one putting down pages, the other pulling them away.
  • Watch to make sure that for every page, there is a point where your hand is not making a shadow on the pages, or blocking the camera. It's OK if your thumb stays on the edges of the paper to hold it down. There is already software to remove things like thumbs from the edge of a paper image.
  • You can also do bound books. With 4K, doing 2 pages at a time is tolerable, but alternately you can do one side and then the other. Pro book scanners have a "V" shaped mount and two cameras, one pointed at each side, and the operator just flips the pages.
  • This might seem silly, but consider making a distinct sound, like literally saying the word "beep" as you lay down each page.
  • You can also do physical objects, holding them in front of the camera and rotating them on all 3 axes, making sure you constantly move your fingers. Software will remove your fingers but you want many frames of every spot with no finger on it.
  • It's probably OK to record at a lower frame rate, but storage is cheap, so why bother?
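
The greenscreen tip is what should make later cropping simple. As a rough sketch of the idea, in plain Python: treat any pixel whose green channel clearly dominates red and blue as background, and take the bounding box of everything else. The function name, the `margin` threshold and the list-of-rows frame layout are illustrative assumptions, not an existing tool.

```python
def page_bounding_box(frame, margin=40):
    """Return (top, left, bottom, right) of the non-green region, or None.

    frame  -- list of rows, each row a list of (r, g, b) tuples, 0-255
    margin -- how far green must exceed red and blue to count as background
              (assumed value; tune it for your lighting)
    """
    top = left = bottom = right = None
    for y, row in enumerate(frame):
        for x, (r, g, b) in enumerate(row):
            if g - max(r, b) > margin:
                continue  # greenscreen pixel, ignore it
            if top is None:
                top = y
            bottom = y
            left = x if left is None else min(left, x)
            right = x if right is None else max(right, x)
    if top is None:
        return None  # nothing but greenscreen in this frame
    return top, left, bottom, right


# Tiny synthetic frame: 6 rows of 8 green pixels with a white "page" inside.
GREEN, WHITE = (20, 200, 30), (240, 240, 235)
frame = [[GREEN] * 8 for _ in range(6)]
for y in range(2, 4):
    for x in range(3, 6):
        frame[y][x] = WHITE

print(page_bounding_box(frame))
```

Anything that is not green counts as page here, including a hand or thumb in the shot, so a real tool would need to handle those separately.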

Two Cameras

Cameras and storage are cheap, so consider having two cameras recording the table. This will allow the software to recover 3-D shapes and also remove glare. From 2 cameras, the software will be able to know and remove the curvature of pages when scanning bound books or pages that don't stay flat. This will also do much better on 3-D objects you want to scan. If you want to get fancy, you could get a small turntable, but those are mostly meant for cameras on the side. If you want to scan grandma's large knickknack collection it could be worth it.

So how good will it be?

I think the results will be excellent. Particularly if the main goal is OCR, or you do it with two cameras. 4K is 3840 x 2160. Along the long side of an 8.5 x 11 page that is about 349 pixels/inch, but to fit the whole page in the 16:9 frame the 2160-pixel side has to cover 8.5 inches, which works out to about 254 pixels/inch. You'll probably zoom out a bit, however, and get a little less. That is still in the range many document scans are made at.
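
The resolution arithmetic is easy to check for other page sizes. The sketch below assumes "4K" means a 3840 x 2160 frame and that the page fills the frame as tightly as it can while staying fully visible; the function name is made up for illustration.

```python
def pixels_per_inch(frame_px, page_in):
    """Best uniform pixels/inch when the whole page must fit in the frame.

    frame_px -- (width, height) of the video frame in pixels
    page_in  -- (width, height) of the page in inches
    Tries both page orientations and keeps the better one.
    """
    fw, fh = frame_px
    pw, ph = page_in
    upright = min(fw / pw, fh / ph)   # page width along the frame width
    rotated = min(fw / ph, fh / pw)   # page turned 90 degrees
    return max(upright, rotated)


UHD = (3840, 2160)     # assumed 4K video frame
LETTER = (8.5, 11.0)   # inches

print(round(pixels_per_inch(UHD, LETTER)))  # whole page in frame: 254
print(round(3840 / 11.0))                   # long side only: 349
```

The short side of the frame is the limit: the 3840-pixel side could give about 349 pixels/inch, but 2160 pixels across 8.5 inches caps a full letter page at about 254.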

On top of that the video will contain several frames of every page, including ones where the page is moving. As long as it's not blurry, this allows "sub-pixel" resolution by combining frames -- the result could well approach 400 to 600 pixels/inch. The main way you'll be inferior to a document scanner is they apply perfect even lighting to every page, and you will have shadows and glare. That's why you want to pay attention to your lighting. Review a few snaps (or what you see in the preview screen of the camera) to make sure it's good.

While today's software won't decode this video, even today you could easily make a tool to extract frames and do OCR on them. And while it doesn't yet exist, a tool to detect when a new page appears and stays still to extract frames for OCR is pretty easy to make. So it's not at all a future science fiction technology to get something quite good from your video. The future technology is the tool that gets something as good as, or better than a pro document scanner. If you make a sound with every page, it makes it even easier to write a program that simply grabs each frame synchronized to the sound and does basic cropping, rotation and OCR. (On greenscreen, the crop and rotate are trivial, there are already tools to do that.)
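
To give a sense of how simple the "page appears and stays still" detector could be, here is a minimal sketch in plain Python. It assumes you already have one motion score per frame (for example the mean absolute pixel difference from the previous frame, computed with whatever tool you like) and picks the middle frame of each quiet stretch. The function name, threshold and minimum run length are illustrative assumptions.

```python
def pick_still_frames(motion, threshold=2.0, min_still=3):
    """Return one frame index per page: the middle of each still run.

    motion[i] is a motion score for frame i (lower means less movement).
    A run of frames counts as a page only if it lasts at least min_still
    consecutive frames below threshold.
    """
    picks = []
    run_start = None
    # The trailing infinity is a sentinel that closes a run at the end.
    for i, score in enumerate(list(motion) + [float("inf")]):
        if score < threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_still:
                picks.append((run_start + i - 1) // 2)
            run_start = None
    return picks


# Made-up scores: two pages held still, then one too-brief pause.
scores = [9.0, 8.5, 0.4, 0.3, 0.2, 0.5, 7.9, 9.1, 0.6, 0.1, 0.2, 8.8, 0.3, 9.5]
print(pick_still_frames(scores))  # [3, 9]
```

The frames at the returned indices are the ones you would hand to cropping and OCR. If you say "beep" with each page, the same run-finding idea could be applied to the audio level instead of the motion score.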

Scanning this way will be superior to the pro document scanner in many ways. It will be faster in most cases, often much faster. And it will be more reliable. The big problem with pro document scanners is they jam and skip pages on anything but nicely collated stacks. If you want to take a fresh printout or a book with the spine chopped off, the document scanner is reliable and fast -- though you need to "prep" most documents to make sure they feed well. But for a stack of papers, especially stacks that are uneven or have papers of different sizes, this will be much faster.

This method would also actually be tolerably good at scanning photographic prints, though they are usually glossy which presents challenges. If you can keep them to a fixed area and zoom the camera, you'll pull out pretty much all the info in a typical drugstore print from the old days.

Two sided

Many documents are two sided. There are a few options. You could flip every page on the table after putting it down. You could take stacks and flip the whole stack over and repeat the process. Whichever is easiest for you. The former will be slightly easier for automated software to collate.
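
If you go with the flip-the-stack option, the collating step is small. A sketch, assuming the software has already produced one image per side in capture order for each pass; whether the backs come out in the same or reverse order depends on how you handle the pile, so that is a parameter here. The names are hypothetical.

```python
def collate(fronts, backs, backs_reversed=False):
    """Interleave front and back scans into reading order.

    fronts, backs  -- lists of page images (any objects) in capture order
    backs_reversed -- True if the second pass saw the last page's back first
    """
    if len(fronts) != len(backs):
        raise ValueError("front and back passes captured different page counts")
    if backs_reversed:
        backs = backs[::-1]
    pages = []
    for front, back in zip(fronts, backs):
        pages.extend([front, back])
    return pages


print(collate(["1f", "2f", "3f"], ["3b", "2b", "1b"], backs_reversed=True))
```

The length check matters: a single missed side would silently pair every later front with the wrong back.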

One might imagine doing this on a clear glass table with cameras underneath. That would work, except the light from the other side would interfere. This approach would work best with strobe lighting: the strobe above the table fires while the top cameras are shooting, then the strobe below fires while the bottom cameras are shooting. That's a design for a more pro version of this.

A pro table

There are more advanced features that a table designed for this as a product could have. Such a table would figure out when there is a new page and only store the page, not a video. It would store lossless if you wanted. It might have suction holes to keep pages flat, or two-sided imaging. It could do many things. This post is about what you can do with tools you probably already have or can buy cheap -- a newer digital camera, a tripod or mount, some lamps and your body.

Let us know in the comments the results of any experiments you do. While this is done in anticipation of future technology, the reality is you can look at the video and figure out today if you got a good enough scan, and the manual approach is always there if it's important.

And yes, I have of course read the book Rainbows End. (Vernor Vinge and I have worked together in the past on e-Books.) In Rainbows End, they digitize a library by slicing up all the books and blowing the pages through a tube full of cameras. Some pages will get missed, but as more libraries are done with multiple copies of the book, you will get them all. Vernor did this deliberately to make all book lovers cringe at the thought of destroying books to scan them, but the principles are similar, and this method is not destructive.


I have used video to capture microscope images for focus stacking.

If I did my calculations correctly HD video captures more resolution than the microscope delivers.

I have used ffmpeg to split video into frames, and ZereneStacker ($) and the ImageJ extended depth of field plugin (free) to do the focus stacking.

This is pretty much the standard for astrophotos now, to shoot video and stack images to remove noise. But both this and focus stacking are different from the goal here, which is to make it as easy and quick as possible to scan your thousands of documents in a way you can feel comfortable throwing the originals away.

This made me think of the scanning solution used in (IIRC) Vernor Vinge's Rainbow's End (my apologies if I have the wrong book/author). The task was to digitize a library and the solution was to throw the books into a mulcher with a lot of high resolution cameras taking video of the small pieces coming out and to stitch the pieces all back together digitally. No need to even open the books!

I describe that in the post, was there something more?

No, sadly just yet another instance of my not reading thoroughly, sorry.

I didn't see Brad's reference to Rainbow’s End either because I (also) stopped reading after he wrote "Let us know in the comments the results of any experiments you do." That sounds like a conclusion. :-)

Otherwise it was a great column!


I led the design + build on a very similar rig last year, using the following components:

-- Sony QX100 Wireless Camera (20MP | F1.8)

-- 3D printed curved arm positioned to capture 8.5x14 or 8.5x17 images.

-- Touch Sensitive Windows .NET App with scikit-image (for thresholding, edge detection, compression etc.) and Sony's SDK/API for QX100.

-- I used a Microsoft Surface Pro 3, but one can pull this off on an Android Tablet or iPad (iPad Pro would be my choice today).

-- Placement mat + guide for positioning the paper where it is expected.

Drawback: QX100 uses WiFi-Direct to bind with the touch device controlling it. Thus the control device will need ethernet or a second WiFi adapter to handle the QX100 and Internet connectivity simultaneously. Alternatively, the QX100 uses WiFi-Direct and the controller uses LTE (works nicely).

Advantage: I can bring the touch device into proximity with the scan pod whenever I desire, for the express purpose of scanning. Thus, all you see when the device is not in use, is the scan arm (the camera disappears neatly into the arm).

It works very, very well, with lightning fast AF and a beautiful lens. Yet the major labor component remains: someone still has to stack, position and flip pages. So I consider it a partial success.

Similar to the device I describe in the previous article linked at the top of the post here.

The idea of the 4K video is to make something that is super fast and easy once you set it up. Obviously a dedicated device that was that easy to use and also produced a well cropped document from day one is even better.

Scanbot on iOS supposedly has a mode called "Continuous Scan" which does this. I haven't tried it though.
