Complete Digitization of Leonardo da Vinci's Codex Atlanticus

WillAdams · 2025-10-28T12:13:01 1761653581

Interesting UI --- wants a full-screen mode and 2-up view and a way to remove all the chrome/UI....

An earlier example of this sort of thing was Bill Gates' purchase of the Codex Leicester https://en.wikipedia.org/wiki/Codex_Leicester which was then digitized and released on a CD-ROM by Corbis:

https://en.wikipedia.org/wiki/Leonardo_da_Vinci_(video_game)

which was quite engaging, but sadly trapped in the technology of the time --- anyone know of an updated version of it?

mzs · 2025-10-28T14:28:55 1761661735

https://codex-atlanticus.ambrosiana.it/

felixbraun · 2025-10-28T18:53:41 1761677621

Excellent work — reminds me of similar projects built with the same tech stack:

– Coins: A journey through the Münzkabinett Berlin collection (one of the largest in the world). https://uclab.fh-potsdam.de/coins/

– Theodor Fontane Marginalia: A visualization of Fontane’s marginalia and notes in his personal library. https://uclab.fh-potsdam.de/ff/

Isamu · 2025-10-28T11:34:34 1761651274

If you get an opportunity to see them in person, it’s worth it because the fine details are that much more impressive up close. Every photo I’ve seen is not as good. Also the illustration is tinier than you would think.

whatever1 · 2025-10-28T09:41:28 1761644488

How much talent can fit in a person? This is how much.

nunodonato · 2025-10-28T10:46:58 1761648418

indeed! The biography of Leonardo was an amazing read. Highly recommend it

proee · 2025-10-28T13:00:01 1761656401

Can you recommend the author?

nunodonato · 2025-10-28T14:37:55 1761662275

https://en.wikipedia.org/wiki/Leonardo_da_Vinci_(Isaacson_bo...

The same author who wrote some other famous biographies. I know some people prefer other DaVinci's biographies. I didn't read others to be able to compare, but I really enjoyed this one.

kragen · 2025-10-28T15:47:00 1761666420

Nitpick: "da Vinci" wasn't our homeboy's name. That just means "from Vinci". He was "Leonardo", like many other people, so we added "da Vinci" to clarify which Leonardo we meant, just like you might say, "Jessica from church came by," to clarify that you didn't mean Jessica the ex-girlfriend. Surnames weren't very widely used in Italy then.

It's like "Jesus of Nazareth"; you wouldn't talk about "other OfNazareth's biographies". Ain't grammatical.

cma · 2025-10-28T17:23:43 1761672223

It's fine. John Smith once meant the John who works as a blacksmith etc. Whatever the original meaning we now widely take da Vinci to be the last name if we don't speak Italian.

kragen · 2025-10-28T17:43:55 1761673435

I agree that the error is common. Try to make new errors instead of repeating common errors.

card_zero · 2025-10-28T18:56:18 1761677778

Does this also apply to DiCaprio? His name seems to translate as "the deer's Leonardo", or maybe "the goat's Leonardo". Possibly "son of a goat".

Wikipedia says that Leonardo da Vinci was properly Leonardo son of Piero from Vinci son of Antonio son of another Piero son of Guido. I'm not sure that moving to surnames was a mistake, you know.

kragen · 2025-10-28T19:04:35 1761678275

Nope, that's his actual surname. He wasn't born in the 16th century.

card_zero · 2025-10-28T19:18:07 1761679087

But at some point back in time, when an ancestral DiCaprio was first referred to as just "DiCaprio", that was an error, right? He should properly be called Quello Figlio di Caprio, that son of a goat. It's not too late.

kragen · 2025-10-28T20:18:04 1761682684

Probably not, no, and AFAIK Leo is a nice guy who doesn't deserve to be deprecated in that way.

kragen · 2025-10-28T05:56:31 1761630991

This is beautiful! I am having some difficulty with the UI; is there a torrent? Images like https://codex-atlanticus.ambrosiana.it/assets/500/000R-1.jpg are too low in resolution for good archival; you can't even read the writing.

trvz · 2025-10-28T06:08:32 1761631712

Manipulate the URL for a higher resolution:

  https://codex-atlanticus.ambrosiana.it/assets/2000/000R-1.jpg

You don't need to depend on others to create a torrent, as bestowed upon you was the power of wget!

  wget https://codex-atlanticus.ambrosiana.it/assets/2000/000R-{1..1119}.jpg
  wget https://codex-atlanticus.ambrosiana.it/assets/2000/000V-{1..1119}.jpg

kragen · 2025-10-28T15:31:29 1761665489

Thanks! On my cellphone not even enough of the UI was working for me to discover those URLs. I suspect a certain amount of error recovery is in order for wgetting all 2238 images. 2000 seems to be the maximum resolution available, which is under 100dpi. A few of the images seem to have been uploaded to https://commons.wikimedia.org/wiki/Category:Codex_Atlanticus.

There are a couple of scans of a 43-page Italian edition published by Ulrico Hoepli on the Archive: https://archive.org/details/codex-atlanticus-leonardo-da-vin... https://archive.org/details/codex-atlanticus-leonardo-da-vin... but they seem to be of very poor quality.

I'm done downloading now (with a sleep of 1 second between pages), and I have 1064125470 bytes of JPEG files, a very reasonably torrentable size. I'll see if I can put together a torrent and upload to the Archive and Commons...
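
A minimal sketch of the kind of polite, restartable download described above. It assumes GNU wget and the URL pattern and 1..1119 page range quoted earlier in the thread; the function names `page_urls` and `fetch_all` are made up for illustration, and this is not necessarily the loop that was actually run.

```shell
# Base URL as quoted in the thread; 2000 is the largest size mentioned there.
BASE="https://codex-atlanticus.ambrosiana.it/assets/2000"

# Print one URL per line in reading order: R-1, V-1, R-2, V-2, ...
# $1 = first page number, $2 = last page number.
page_urls() {
    page=$1
    while [ "$page" -le "$2" ]; do
        for side in R V; do
            echo "$BASE/000$side-$page.jpg"
        done
        page=$((page + 1))
    done
}

# Feed the list to wget on stdin.
#   --wait=1    pause one second between requests
#   --tries=3   retry a failed transfer up to three times
#   --continue  resume partially downloaded files (if the server allows it)
fetch_all() {
    page_urls 1 1119 | wget --input-file=- --wait=1 --tries=3 --continue
}
```

Calling `fetch_all` starts the download, and re-running it after an interruption should pick up where it left off. `page_urls 1 2` on its own just prints the first four URLs, which is a cheap way to check the pattern before fetching anything.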

WithinReason · 2025-10-28T10:35:58 1761647758

Or in PowerShell on Windows:

  1..1119 | % { iwr "https://codex-atlanticus.ambrosiana.it/assets/2000/000R-$_.jpg" -OutFile "000R-$_.jpg" }
  1..1119 | % { iwr "https://codex-atlanticus.ambrosiana.it/assets/2000/000V-$_.jpg" -OutFile "000V-$_.jpg" }

embedding-shape · 2025-10-28T13:59:55 1761659995

Some people around me swear PowerShell has better user experience than unix shells, but then I keep seeing examples like these. How on earth could people prefer this compared to `wget https://codex-atlanticus.ambrosiana.it/assets/2000/000V-{1.....`?

kragen · 2025-10-28T15:32:08 1761665528

In this case presumably the main difference is not PowerShell vs. bash but iwr vs. wget? Because I think this is roughly equally bad (untested):

    for page in {1..1119}; do
        iwr "https://codex-atlanticus.ambrosiana.it/assets/2000/000R-$page.jpg" -OutFile "000R-$page.jpg"
        iwr "https://codex-atlanticus.ambrosiana.it/assets/2000/000V-$page.jpg" -OutFile "000V-$page.jpg"
    done

Also until recently bash didn't have {42..53} syntax. You had to use `seq`. There was an alternative name for `seq` in Unix Power Tools, `jot`, because it wasn't standard: https://docstore.mik.ua/orelly/unix/upt/ch45_11.htm. This section was by ORA author and sysadmin Linda Mui (https://www.oreilly.com/pub/au/268), but I don't know if she wrote `jot` or just popularized it.

NoMoreNicksLeft · 2025-10-28T06:54:04 1761634444

Any idea on how to best compile it to an ebook? Just stuffing the jpgs into a pdf rarely works well...

foofoo12 · 2025-10-28T10:21:24 1761646884

I usually do the thing that rarely works well for you, and it works decently for me. You get 1 page per image and the image isn't compressed or touched at all.

  apt install img2pdf
  img2pdf *.jpg -o leonardo-da-book.pdf

nunodonato · 2025-10-28T10:42:06 1761648126

Wouldn't this mess up the order? I think you are supposed to view it like R1, V1, R2, V2, etc.

foofoo12 · 2025-10-28T10:51:27 1761648687

Yes, this was just an example. Using wildcard expansion will give you whatever order your current shell sees fit. Bash does alphabetical order.

kragen · 2025-10-28T16:14:22 1761668062

More like

    echo $(for page in {1..1119}; do for side in R V; do
      echo "000$side-$page.jpg"; done; done)
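
Spelled out as a sketch, that ordering can be passed straight to img2pdf. This assumes the files sit in the current directory under the names the wget commands above produce; `ordered_pages` and `build_pdf` are hypothetical helper names, and the output filename is the one from the earlier img2pdf example.

```shell
# Print the scan filenames in reading order:
# 000R-1.jpg, 000V-1.jpg, 000R-2.jpg, 000V-2.jpg, ...
# $1 = last page number.
ordered_pages() {
    page=1
    while [ "$page" -le "$1" ]; do
        for side in R V; do
            echo "000$side-$page.jpg"
        done
        page=$((page + 1))
    done
}

# Pass the names to img2pdf in that order. The unquoted $(...) relies on
# word splitting, which is safe here because no filename contains spaces.
build_pdf() {
    img2pdf $(ordered_pages 1119) -o leonardo-da-book.pdf
}
```

Running `build_pdf` writes the PDF with recto and verso interleaved; `ordered_pages 2` prints just the first four names if you want to check the order first.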

ticulatedspline · 2025-10-28T18:23:42 1761675822

Easy way would be to just drop them in a zip and label it .cbz. Most readers handle CBR/CBZ just fine.

kragen · 2025-10-28T19:05:21 1761678321

Oh, is .cbz that simple? Does it use the file order of the zipfile members or some other order? (https://acbf.fandom.com/wiki/ACBF_Editor_-_Creating_Metadata says it uses alphabetical order, which is the wrong order in this case.)

It may be useful to use zip -Z store. JPEG data isn't going to get much benefit from another layer of LZ77.
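
If the reader really does sort members alphabetically, one workaround is to rename the files so that alphabetical order and reading order coincide. A sketch under that assumption, using Info-ZIP zip 3.0; the naming scheme, the `cbz/` staging directory, the archive name, and the function names are all invented for illustration.

```shell
# Hypothetical naming scheme: zero-padded page number first, then the side,
# so plain alphabetical order equals reading order (R sorts before V).
# $1 = page number, $2 = side (R or V).
cbz_name() {
    printf '%04d-%s.jpg\n' "$1" "$2"
}

# Copy the scans into a staging directory under the new names, then zip them.
make_cbz() {
    mkdir -p cbz
    page=1
    while [ "$page" -le 1119 ]; do
        for side in R V; do
            cp "000$side-$page.jpg" "cbz/$(cbz_name "$page" "$side")"
        done
        page=$((page + 1))
    done
    # -Z store: no recompression; -j: drop the cbz/ prefix from member names.
    zip -Z store -j codex-atlanticus.cbz cbz/*.jpg
}
```

The copy temporarily doubles the disk space used, so hard links (`ln` instead of `cp`) may be preferable on a tight disk.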

c0balt · 2025-10-28T07:42:55 1761637375

I haven't done this in some time, but templating some markdown code for pandoc and creating an epub might be a viable avenue.

kragen · 2025-10-28T16:16:41 1761668201

Maybe what rarely works well for NoMoreNicksLeft is having a gigabyte of JPEGs in a single HTML chapter inside the epub? In that case you could do something like divide the files into 373 "chapters" of 6 pages each?
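
The arithmetic behind that split: 1119 folios times 2 sides is 2238 images, and 2238 / 6 = 373 exactly. As a sketch, mapping an image's position in reading order to its chapter could look like this (`chapter_of` and `IMAGES_PER_CHAPTER` are illustrative names, not part of any tool):

```shell
IMAGES_PER_CHAPTER=6

# Map a 1-based image index (1..2238 in reading order) to a 1-based chapter.
chapter_of() {
    echo $(( ($1 - 1) / IMAGES_PER_CHAPTER + 1 ))
}
```

Images 1 through 6 land in chapter 1, image 7 starts chapter 2, and image 2238 falls in chapter 373.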

One of the fragmentary editions I linked on the Archive uses the .cbr Comic Book Reader format; perhaps that is a better format than .epub for high-resolution scans of every page?

NoMoreNicksLeft · 2025-10-28T13:17:10 1761657430

Oooh... I have even less luck with epub, when the pages are an image-per-page.

atoav · 2025-10-28T12:32:24 1761654744

Calibre comes with an ebook-convert command; that one might work.

eMPee584 · 2025-10-28T09:38:54 1761644334

ocrmypdf (rocks!)

nunodonato · 2025-10-28T10:39:56 1761647996

Amazing! The categorization is nice, but I would love to see some sort of "tag cloud" that would allow us to view more specific content. How long until someone creates a tool to RAG the hell out of this? :)

vim-guru · 2025-10-28T07:52:32 1761637952

Why are some of the pages upside down?

embedding-shape · 2025-10-28T14:01:54 1761660114

It's a bit bananas, but probably just because he could. He also wrote his personal notes in "mirror writing":

> The notes on Leonardo da Vinci's famous Vitruvian Man image are in mirror writing. Leonardo da Vinci wrote most of his personal notes in mirror writing, only using standard writing if he intended his texts to be read by others

https://en.wikipedia.org/wiki/Mirror_writing

foofoo12 · 2025-10-28T10:22:08 1761646928

Da Vinci was showing off.

b34k3r · 2025-10-28T09:36:40 1761644200

just rotate your monitor

MangoToupe · 2025-10-28T15:56:36 1761666996

> We use it to express mild surprise that one person could use both their left and right hemispheres equally well.

When did this myth become so perpetuated? It's infuriating. I blame university administration. I can't think of any other reason to so firmly distinguish different areas of thought.

NaomiLehman · 2025-10-28T17:11:51 1761671511

I'm training a model based on this /s