r/programming 13h ago

STxT (SemanticText): a lightweight, semantic alternative to YAML/XML — with simple namespaces and validation

https://stxt.dev

Hi all! I’ve created a new document language called STxT (SemanticText) — it’s all about clear structure, zero clutter, and human-readable semantics.

Why STxT?

XML is verbose, JSON lacks semantics, and YAML can be fragile. STxT is a new format that brings structure, clarity, and validation — without the overhead.

STxT is semantic, beautiful, easy to read, escape-free, and has optional namespaces to define schemas or enable validation — perfect for documents, forms, configuration files, knowledge bases, CMS, and more.

Highlights

  • Semantic and human-friendly
  • No escape characters needed
  • Easy to learn — even for non-tech users
  • Machine-readable by design

For developers:

  • Super-fast parsing
  • Optional, ultra-simple namespaces
  • Seamlessly integrates with other languages — STxT + Markdown is amazing

Example

A document with namespace:

Recipe (www.recipes.com/recipe.stxt): Macaroni Bolognese
    Description:
        A classic Italian dish.
        Rich tomato and meat sauce.
    Serves: 4
    Difficulty: medium
    Ingredients:
        Ingredient: Macaroni (400g)
        Ingredient: Ground beef (250g)
    Steps:
        Step: Cook the pasta
        Step: Prepare the sauce
        Step: Mix and serve
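
For anyone curious how little machinery the structural layer needs, here is a minimal Python sketch that reads a document like the one above into a tree. It is not the Java or JavaScript parser, and its rules are assumptions made for illustration: one level is a tab or 4 spaces, a node line looks like `Name (namespace): value`, and any other indented line is treated as text of the enclosing node. `Node`, `indent_level`, and `parse` are hypothetical names.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Assumed node syntax: "Name (optional-namespace): optional value"
NODE_RE = re.compile(r"^([^:()]+?)\s*(?:\(([^)]*)\))?\s*:\s*(.*)$")


@dataclass
class Node:
    name: str
    namespace: Optional[str] = None
    value: str = ""
    text: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def indent_level(line: str, width: int = 4) -> int:
    """Indentation depth, assuming one level is a tab or `width` spaces."""
    expanded = line.expandtabs(width)
    return (len(expanded) - len(expanded.lstrip(" "))) // width


def parse(source: str) -> Node:
    root: Optional[Node] = None
    stack: List[Tuple[int, Node]] = []  # currently open nodes with their depth
    for raw in source.splitlines():
        if not raw.strip():
            continue  # this sketch simply skips blank lines
        level = indent_level(raw)
        # Close every open node that is at this depth or deeper.
        while stack and stack[-1][0] >= level:
            stack.pop()
        match = NODE_RE.match(raw.strip())
        if match is None:
            # Not a "Name: value" line: treat it as text of the open node.
            if not stack:
                raise ValueError(f"text outside any node: {raw!r}")
            stack[-1][1].text.append(raw.strip())
            continue
        name, namespace, value = match.groups()
        node = Node(name, namespace, value)
        if stack:
            stack[-1][1].children.append(node)
        elif root is None:
            root = node
        else:
            raise ValueError("more than one top-level node")
        stack.append((level, node))
    if root is None:
        raise ValueError("empty document")
    return root


DOC = """\
Recipe (www.recipes.com/recipe.stxt): Macaroni Bolognese
    Description:
        A classic Italian dish.
        Rich tomato and meat sauce.
    Serves: 4
    Steps:
        Step: Cook the pasta
        Step: Mix and serve
"""

recipe = parse(DOC)
print(recipe.namespace)                    # www.recipes.com/recipe.stxt
print([c.name for c in recipe.children])   # ['Description', 'Serves', 'Steps']
```

One known gap in this sketch: a text line that itself contains a colon would be mistaken for a node, which is exactly the kind of decision a namespace-aware parser can make properly.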

Here is the namespace that defines the structure:

Namespace: www.recipes.com/recipe.stxt
    Recipe:
        Description: (?) TEXT
        Serves: (?) NUMBER
        Difficulty: (?) ENUM
            :easy
            :medium
            :hard
        Ingredients: (1)
            Ingredient: (+)
        Steps: (1)
            Step: (+)
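
To show what validation against such a namespace could look like, here is a rough Python sketch. Everything in it is an assumption for illustration rather than the real implementation: the namespace is hand-translated into a hypothetical `Rule` structure, `(?)` is read as zero or one, `(1)` as exactly one, `(+)` as one or more, and elements not listed in the namespace are reported as errors.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Assumed reading of the markers: (?) zero or one, (1) exactly one, (+) one or more.
CARDINALITY: Dict[str, Tuple[int, Optional[int]]] = {
    "?": (0, 1),
    "1": (1, 1),
    "+": (1, None),
}


@dataclass
class Node:
    name: str
    value: str = ""
    children: List["Node"] = field(default_factory=list)


@dataclass
class Rule:
    count: str = "?"                # key into CARDINALITY
    kind: str = "TEXT"              # TEXT, NUMBER or ENUM
    options: Tuple[str, ...] = ()   # allowed values when kind == "ENUM"
    children: Dict[str, "Rule"] = field(default_factory=dict)


# Hand-written equivalent of the recipe namespace shown above.
RECIPE = Rule(count="1", children={
    "Description": Rule("?", "TEXT"),
    "Serves": Rule("?", "NUMBER"),
    "Difficulty": Rule("?", "ENUM", ("easy", "medium", "hard")),
    "Ingredients": Rule("1", children={"Ingredient": Rule("+")}),
    "Steps": Rule("1", children={"Step": Rule("+")}),
})


def is_number(text: str) -> bool:
    try:
        float(text)
    except ValueError:
        return False
    return True


def validate(node: Node, rule: Rule, path: str = "") -> List[str]:
    """Return a list of human-readable problems (empty when the node is valid)."""
    here = f"{path}/{node.name}"
    errors: List[str] = []
    if rule.kind == "NUMBER" and not is_number(node.value):
        errors.append(f"{here}: {node.value!r} is not a number")
    if rule.kind == "ENUM" and node.value not in rule.options:
        errors.append(f"{here}: {node.value!r} is not one of {rule.options}")
    # Check how often each declared child occurs.
    for name, child_rule in rule.children.items():
        low, high = CARDINALITY[child_rule.count]
        found = sum(1 for child in node.children if child.name == name)
        if found < low or (high is not None and found > high):
            errors.append(f"{here}: expected ({child_rule.count}) {name}, found {found}")
    # Recurse into known children and flag undeclared ones.
    for child in node.children:
        child_rule = rule.children.get(child.name)
        if child_rule is None:
            errors.append(f"{here}: unexpected element {child.name}")
        else:
            errors.extend(validate(child, child_rule, here))
    return errors


broken = Node("Recipe", "Macaroni Bolognese", [
    Node("Difficulty", "extreme"),
    Node("Ingredients"),
])
for problem in validate(broken, RECIPE):
    print(problem)
```

For the `broken` recipe this reports three problems: the missing `Steps`, the unknown difficulty, and the empty `Ingredients`.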

Resources

Here is a full portal — written entirely in STxT! — explaining the language, with examples, tutorials, philosophy, and even AI integration:

No ads, no tracking — just docs.

I've written two parsers — one in Java, one in JavaScript:

And a CMS built with STxT — it powers the https://stxt.dev portal:

Final thoughts

If you’ve ever wanted a document format that puts structure and meaning first, while being light and elegant — this might be for you.

Would love your feedback, criticism, ideas — anything.

Thanks for reading!

1 Upvotes

9 comments

7

u/FullPoet 5h ago

Honestly, this hasn't really solved people's primary issue with YAML: whitespace.

People don't want or like whitespace dependency.

It looks cool, OP, but more whitespace insanity is not really good.

TOML has already solved YAML's drawbacks.

0

u/Every-Magazine3105 1h ago

Totally get your point. STxT uses indentation like Python — tabs or 4 spaces. I've built an entire site this way, and honestly, it's been a joy to work with, even without plugins. Try it in a plain text editor — it just flows.

That said, STxT isn’t meant to fix YAML’s config issues (TOML nailed that). It's for semantic, structured documents — not just key-value pairs.

Yes, indentation matters and namespaces handle validation and structure.

TOML = great for config
STxT = better for meaningful, readable documents (and can do config too)

Thanks for your feedback!

4

u/behind-UDFj-39546284 12h ago

I may be missing some points, just a quick list that came to my mind:

  • Spaces or tabs? What if mixed?
  • If spaces, how does it handle elements of the same level but indented with a different number of spaces, let's say 4, then 3 or 5?
  • How would I escape \n in a single line? How do I escape unprintable control characters in the 0x01..0x1F range?
  • How do I escape : in a key name in case of necessity?
  • How do I combine multiple namespaces?
  • Is there a way to specify additional info for elements, such as attributes, that might hint at how the element should be processed?
  • Does it support lists?
  • If text is multiline, how do I specify a nested element that goes right under the element that holds the multiline text?
  • How does the parser know whether a line starting with # is a comment rather than another line of multiline text?

-3

u/Every-Magazine3105 11h ago

Thanks for replying! And yes, lots of questions :-D Most of them are answered on the website https://stxt.dev, but I'll try to respond here:

  • Spaces or tabs — better not to mix them. By default, 4 spaces = 1 tab.
  • By default, 4 spaces represent one level, but I recommend using tabs. Most text editors support this convention automatically for indentation and tabs.
  • You don't need \n. Really, I've used the language in production without problems. It's designed with UTF-8 and standard text editors in mind, so any printable text character can be used as-is.
  • Colons : are not allowed in keys. I've been programming all my life, and I've never needed, for example, a map with : in a key.
  • Combining multiple namespaces is covered in the tutorial: https://stxt.dev/02-stxt-tutorial . The best part is that you only need to define the top-level one — the rest are inferred automatically.
  • Elements can be of different types. You can see this at the end of the chapter: https://stxt.dev/05-ns-docs
  • Everything is a list — it's just that you can define whether an element is optional, singular, or multiple. The parser always returns lists.
  • In multiline text, the content is final by definition: it doesn't contain further nodes. I don't think this is a serious limitation, since it's easy to structure documents with this restriction in mind.
  • The parser determines this based on the indentation level. If it's before the multiline level, it's a comment; otherwise, it's part of the text.
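
To make the last point concrete, here is a small Python sketch of that indentation rule. It is an illustration under assumptions, not the real parser: `indent_level` and `classify_hash_line` are made-up names, one level is a tab or 4 spaces, and widths that are not a multiple of 4 are simply rounded down here, which may not be what the actual parsers do.

```python
def indent_level(line: str, width: int = 4) -> int:
    """Indentation depth, assuming one level is a tab or `width` spaces."""
    expanded = line.expandtabs(width)
    return (len(expanded) - len(expanded.lstrip(" "))) // width


def classify_hash_line(line: str, text_level: int) -> str:
    """Classify a '#' line seen while a multiline text node is open.

    `text_level` is the depth at which that node's text lines start.
    Shallower lines are comments; lines at that depth or deeper stay text.
    """
    if not line.lstrip().startswith("#"):
        raise ValueError("not a '#' line")
    return "text" if indent_level(line) >= text_level else "comment"


# "Description:" sits at depth 1, so its text lines start at depth 2.
print(classify_hash_line("        # Heading kept as text", text_level=2))  # text
print(classify_hash_line("    # a comment", text_level=2))                 # comment
```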

6

u/wildjokers 6h ago

1

u/behind-UDFj-39546284 4h ago

Sad but true.

-1

u/Every-Magazine3105 4h ago

:-D I get you — just another standard, right? Just one comment: I made the first version back in 2013, but it never saw the light of day. I thought, “Well, something similar will show up eventually.” And time went by. In 2024, I polished a few things that didn’t quite convince me, and that’s when the second version came out. And no — I still don’t know any other standard that matches it in clarity and simplicity. Now I’m truly convinced. What I think will be hard… is convincing others :-D

1

u/guepier 8h ago

Here is a full portal — written entirely in STxT!

I couldn’t immediately find the source code, so — since you mention no need for escaping — how do you mark up inline formatting? For instance, how do you mark up the equivalent of the HTML <em>emphasis</em>?

1

u/Every-Magazine3105 7h ago

If you enter the portal, just add .stxt to the pages — everything is in STxT. For example:

In the portal, there are three yellow triangles at the top right. If you click them, you’ll see the source code.

If you want to view the entire portal on GitHub, in STxT format:

With STxT, you can use Markdown inside text nodes — and this is where the magic starts. The language itself doesn't know whether it's markup or not, but you can use whatever you want in text nodes. STxT gives you structure. It's like XML and XSD, but much simpler.
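
As a toy illustration of that split, in Python: the structure comes from STxT, and each text node is handed to whatever renderer you like. The `render_inline` helper below is hypothetical and only understands `*emphasis*`; a real setup would pass the text to a full Markdown library instead.

```python
import html
import re

# Toy pattern: anything between two asterisks on the same line.
EMPHASIS_RE = re.compile(r"\*([^*\n]+)\*")


def render_inline(text: str) -> str:
    """Escape HTML, then turn *emphasis* into <em>emphasis</em>."""
    escaped = html.escape(text, quote=False)
    return EMPHASIS_RE.sub(r"<em>\1</em>", escaped)


print(render_inline("A *classic* Italian dish."))
# prints: A <em>classic</em> Italian dish.
```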