SESHAT — an honest computational-epigraphy platform

Seshat — Egyptian goddess of writing, knowledge, and measurement.

Tools for ancient scripts that never claim more than the evidence supports. Deciphered scripts get real tooling; undeciphered ones get honest limit-analysis; nothing is ever "deciphered", "translated", or assigned phonetic values by SESHAT. The contract is enforced in code, not just in prose.

The flagship study is Linear A — a measurement, not a decipherment — and every method is calibrated on Linear B (deciphered Mycenaean Greek): if a technique can't recover what we already know about Linear B, we don't trust it on Linear A. A rigorous negative is a valid headline.

Flagship result — Linear A, honestly

| Phase | Question | Result |
|---|---|---|
| 1: Information limit | How much structure does the corpus hold? | Linear A: `H(next\|prev) ≈ 3.7 bits`, redundancy 31% vs 63% for Linear B. Linear A is far sparser, which quantifies why it resists decipherment. Entropy-rate analysis is reliable only up to n=2; at n≥3 the estimates are undersampled (even for Linear B here). |
| 2: GNN sign embeddings | Does co-occurrence encode phonology? | On Linear B, recovers vowels (+8–11 pts, null-controlled `z≈4`) but not consonants (null result). |
| 3: Linear B → Linear A transfer | Do shared signs carry their Linear B vowels? | No (`\|z\|<2`): Linear A does not distributionally mirror Linear B, which is consistent with a different language. |
| + Typology | Is Linear A statistically like any comparandum? | Size-matched fingerprints (bootstrap): Linear A is robustly nearest to Linear B, but that is writing-system kinship (both are Aegean syllabaries), not a shared-language claim. |
| + Positional | Do signs specialise by position? | Yes (calibrated on Linear B); Linear A shows real positional structure. This is a measurement, not a reading. |
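
The Phase 1 quantities can be illustrated with a minimal sketch. It is not the code in `seshat_analysis.info_limit`: the function names, the toy sign labels, and in particular the redundancy definition (one minus the conditional entropy divided by the log of the inventory size) are assumptions made for illustration only.

```python
from collections import Counter
from math import log2


def conditional_entropy(sequences):
    """Plug-in estimate of H(next | prev), in bits, from bigram counts."""
    bigrams = Counter()
    for seq in sequences:
        bigrams.update(zip(seq, seq[1:]))
    total = sum(bigrams.values())
    prev_totals = Counter()
    for (prev, _), n in bigrams.items():
        prev_totals[prev] += n
    # H(next|prev) = -sum over bigrams of p(prev, next) * log2 p(next | prev)
    return -sum(
        (n / total) * log2(n / prev_totals[prev])
        for (prev, _), n in bigrams.items()
    )


def redundancy(sequences):
    """ASSUMED definition: 1 - H(next|prev) / log2(inventory size)."""
    inventory = {sign for seq in sequences for sign in seq}
    return 1.0 - conditional_entropy(sequences) / log2(len(inventory))


# Toy corpus with opaque, hypothetical sign labels (no values implied).
toy = [["s1", "s2"], ["s1", "s2"], ["s3", "s4", "s5"], ["s3", "s4", "s6"]]
print(round(conditional_entropy(toy), 4))  # 0.3333
print(round(redundancy(toy), 3))           # 0.871
```

A plug-in estimate like this is biased downward on small corpora, which is one reason higher-order (n≥3) estimates are hard to trust at these corpus sizes.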

The multi-script platform

SESHAT is now a per-script platform (ADR-0003), with the honesty contract modelled on two independent axes — because they genuinely come apart:

Script readable? (are the sign values known?) · Language understood?

| script | script readable? | language understood? | what SESHAT does |
|---|---|---|---|
| Egyptian, cuneiform, Phoenician, Linear B, … | ✓ | ✓ | real sign tooling + analysis |
| Meroitic | ✓ | ✗ | sign tooling, flagged language-undeciphered |
| Linear A | ✗ | ✗ | limit-analysis only, with no value assigned to any sign |

SESHAT covers 16 scripts. Deciphered scripts get sign-inventory tooling taken straight from the Unicode standard via `unicodedata`, with nothing hand-typed: Egyptian hieroglyphs (Gardiner codes), Sumero-Akkadian cuneiform (readings + composites), and a generic tool covering the Linear B syllabary, Anatolian (Luwian) hieroglyphs, Phoenician, Ugaritic, Old Persian, Old Turkic, Gothic, Old Italic, Imperial Aramaic, Carian, Lycian, Lydian, and Meroitic.

The registry routes every request and refuses to break the contract: decipher, translate, and assign-values requests raise for every script, and sign tooling is refused for unreadable scripts.

from seshat_analysis.registry import route, tooling
route("linear_a", "decipher")        # -> ContractError (never, for any script)
route("linear_a", "sign_inventory")  # -> ContractError (Linear A is not readable)
route("linear_a", "info_limit")      # -> OK (limit-analysis)
tooling("egyptian")()                # -> 1071 hieroglyphs with Gardiner codes
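
To make the two-axis contract concrete, here is a minimal sketch of how such a check could be structured. It is not the `seshat_analysis.registry` implementation: the `Script` dataclass, the `check` function, the three-entry table, and the operation names `translate` and `assign_values` are hypothetical and chosen only to mirror the behaviour described above.

```python
from dataclasses import dataclass


class ContractError(Exception):
    """Raised when a request would break the honesty contract."""


@dataclass(frozen=True)
class Script:
    readable: bool             # are the sign values known?
    language_understood: bool  # is the underlying language understood?


# Hypothetical three-entry registry, one script per row of the axes table.
SCRIPTS = {
    "linear_b": Script(readable=True, language_understood=True),
    "meroitic": Script(readable=True, language_understood=False),
    "linear_a": Script(readable=False, language_understood=False),
}

# Assumed operation names; refused for every script, deciphered or not.
FORBIDDEN = {"decipher", "translate", "assign_values"}
# Operations that only make sense when the sign values are known.
NEEDS_READABLE = {"sign_inventory"}


def check(script_id, operation):
    """Return "OK" if the operation is allowed, else raise ContractError."""
    script = SCRIPTS[script_id]
    if operation in FORBIDDEN:
        raise ContractError(f"{operation!r} is never offered, for any script")
    if operation in NEEDS_READABLE and not script.readable:
        raise ContractError(f"{script_id!r} is not readable: no sign tooling")
    return "OK"


print(check("linear_a", "info_limit"))      # OK
print(check("meroitic", "sign_inventory"))  # OK
```

Putting the refusal in the lookup path, rather than in each tool, means a new analysis module cannot accidentally offer a forbidden operation.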

How it works (each language where it fits)

| component | language | role |
|---|---|---|
| `seshat-analysis/` | Python | info-limit, block-entropy, GNN embeddings, typology, positional, Unicode sign tooling, the registry, null-model controls |
| `seshat-core/` | Rust | corpus parser, sign inventory, bigram matrices |
| `seshat-anneal/` | C++/CUDA | QUBO annealer: a future refinement layer (Phase 4), synthetic-only, not a decipherment claim |
| `seshat-viz/` | Rust/egui | interactive sign tables and heatmaps |
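
Of the analyses listed for `seshat-analysis/`, the positional one is the easiest to sketch. The snippet below shows one plausible way to measure whether signs specialise by position within a word. The slot names, the function name, and the toy sign labels are assumptions and are not taken from `seshat_analysis.positional`.

```python
from collections import Counter, defaultdict


def positional_profile(words):
    """For each sign, the fraction of its occurrences in each word slot."""
    counts = defaultdict(Counter)
    for word in words:
        last = len(word) - 1
        for i, sign in enumerate(word):
            if last == 0:
                slot = "single"
            elif i == 0:
                slot = "initial"
            elif i == last:
                slot = "final"
            else:
                slot = "medial"
            counts[sign][slot] += 1
    return {
        sign: {slot: n / sum(c.values()) for slot, n in c.items()}
        for sign, c in counts.items()
    }


# Toy words built from opaque, hypothetical sign labels.
words = [["s1", "s2", "s3"], ["s1", "s4"], ["s2", "s3"]]
profile = positional_profile(words)
print(profile["s1"])  # {'initial': 1.0}
print(profile["s2"])  # {'medial': 0.5, 'initial': 0.5}
```

A profile like this describes where a sign occurs and says nothing about what it sounds like or means.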

Reproduce

cd seshat-analysis && pip install -e .
pytest                                              # ~78 tests
python -m seshat_analysis.info_limit       --data ../data/corpus   # Phase 1
python -m seshat_analysis.typology         --data ../data/corpus   # comparative typology
python -m seshat_analysis.positional       --data ../data/corpus   # positional structure
python -m seshat_analysis.egyptian                                 # hieroglyph inventory
python -m seshat_analysis.registry                                 # the multi-script contract

Everything is seeded, deterministic, offline, and CPU-only.

Honesty contract

Linear A is not deciphered here; no phonetic value is asserted for any undeciphered sign. Deciphered scripts are ground truth; undeciphered ones are measurement, never announcement. Methods are trusted only after recovering known structure on a deciphered script and surviving a null-model control. The contract is enforced structurally by the registry.
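
The null-model control mentioned above can be sketched as a seeded label-permutation test that reports a z-score. This is an illustration under stated assumptions, not the project's control code: `null_z`, `mean_gap`, the group labels, and the numbers are all hypothetical.

```python
import random
from statistics import mean, stdev


def mean_gap(values, labels):
    """Example statistic: mean of group "a" minus mean of group "b"."""
    a = [v for v, g in zip(values, labels) if g == "a"]
    b = [v for v, g in zip(values, labels) if g == "b"]
    return mean(a) - mean(b)


def null_z(statistic, values, labels, n_null=500, seed=0):
    """z-score of the observed statistic against a label-shuffled null."""
    rng = random.Random(seed)  # seeded, so the result is reproducible
    observed = statistic(values, labels)
    shuffled = list(labels)
    null = []
    for _ in range(n_null):
        rng.shuffle(shuffled)  # break any real link between value and label
        null.append(statistic(values, shuffled))
    return (observed - mean(null)) / stdev(null)


# Made-up scores for two hypothetical groups of eight signs each.
values = [0.90, 0.80, 0.85, 0.95, 0.88, 0.82, 0.91, 0.79,
          0.10, 0.20, 0.15, 0.12, 0.18, 0.22, 0.09, 0.25]
labels = ["a"] * 8 + ["b"] * 8
z = null_z(mean_gap, values, labels)
print(z > 2)  # True
```

A method passes this kind of control only if the observed statistic stands well clear of what shuffled labels produce. A result with `|z| < 2` is reported as a null.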

Data & provenance

Linear A: John Younger's Linear A Database (Univ. of Kansas)
Linear B: Ventris & Chadwick; Duhoux & Morpurgo Davies; sign values from the authoritative Unicode Linear B Syllabary names (not hand-typed)
Comparanda: Luwian, Hurrian
Sign inventories: the Unicode standard, via unicodedata
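
As an illustration of reading sign inventories from `unicodedata` rather than typing them by hand, the sketch below enumerates character names in two code-point ranges. The helper `named_signs` is hypothetical, and the ranges are assumptions: U+13000 to U+1342E is the Egyptian Hieroglyphs repertoire as first encoded, and U+10000 to U+1007F is the Linear B Syllabary block.

```python
import unicodedata


def named_signs(start, end, prefix):
    """Map each character in [start, end] whose name begins with `prefix`
    to the remainder of its Unicode name."""
    signs = {}
    for cp in range(start, end + 1):
        name = unicodedata.name(chr(cp), "")  # "" for unassigned code points
        if name.startswith(prefix):
            signs[chr(cp)] = name[len(prefix):]
    return signs


egyptian = named_signs(0x13000, 0x1342E, "EGYPTIAN HIEROGLYPH ")
linear_b = named_signs(0x10000, 0x1007F, "LINEAR B SYLLABLE ")

print(len(egyptian))           # 1071
print(egyptian["\U00013000"])  # A001
print(linear_b["\U00010000"])  # B008 A
```

The Unicode names carry zero-padded Gardiner codes (`A001` rather than `A1`), and the Linear B names carry the sign number followed by its conventional value. For that range the count agrees with the 1071 hieroglyphs quoted earlier.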

Author

Antonio Zambudio Rodriguez

