home.social

Search

1000 results for “lexer”

  1. Wîhennahtwoche (Christmas week)! That means wîenechtfîrtage (Christmas holidays), including wîenahtâbent (Christmas Eve) and wînahtnaht (Christmas night). Perhaps also with wîenachtbrôt (Christmas bread) and weinachtkaese (Christmas cheese)?

    These and other Christmas lemmata can be found (also via the API) in the Middle High German dictionaries at woerterbuchnetz.de/ #Lexer #BMZ #Findebuch #LODvent

  2. Anyone have experience with formal #lexer / #parser libraries in #python? (e.g. lark)

    I have a feature request for jc to parse the output of scutil and ipconfig on macOS (not the Windows ipconfig), but it looks to me like it needs a formal grammar - not something you could parse with a simple custom parser or regexes.

    github.com/kellyjonbrazil/jc/i

  3. One of my favorite #lexer techniques is, if the host language supports it, to write a big regex with named capture groups and abuse leftmost-longest semantics and zero-width assertions to iterate over all the tokens in the input.

    It can get unwieldy for some things, but a lexer for Manatee is a single 33 line regex.

    (Obviously the regex can't, e.g., convert numeric lexemes to numeric type, but for the actual tokenization, it works really well and can be very fast.)

    #parsing
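    The technique in this post can be sketched with Python's stdlib `re` module (token names here are invented for illustration; note that Python's `re` uses leftmost-first alternation rather than POSIX leftmost-longest, so alternative order matters):

```python
import re

# One big verbose regex; each named capture group is a token type.
# KEYWORD is listed before IDENT because Python tries alternatives
# left to right (leftmost-first, not leftmost-longest).
TOKEN_RE = re.compile(r"""
      (?P<KEYWORD>\b(?:if|else|while)\b)
    | (?P<IDENT>[A-Za-z_]\w*)
    | (?P<NUMBER>\d+)
    | (?P<OP>[+\-*/=])
    | (?P<SKIP>\s+)
""", re.VERBOSE)

def tokenize(src):
    """Iterate (kind, lexeme) pairs, skipping whitespace."""
    for m in TOKEN_RE.finditer(src):
        if m.lastgroup != "SKIP":
            yield m.lastgroup, m.group()

print(list(tokenize("if x1 = 42")))
# [('KEYWORD', 'if'), ('IDENT', 'x1'), ('OP', '='), ('NUMBER', '42')]
```

    As the post notes, converting lexemes (e.g. `'42'` to an int) still needs a pass over the matches, but the tokenization itself is a single compiled regex.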

  4. … the #Pygments lexer for #CommonLisp treats ‘defun’ and ‘list*’ (among other symbols) as basically the same thing. They're both ‘builtins’.

    As the kids say, my disappointment is immeasurable.

    Ggghh, emacs --batch + font-lock-fontify-buffer + output the face changes maybe?

  6. ⚡ Feature: Lexer improvements

    Running some unit tests on the latest code for the lexer improvements made by Gustav

    PR: deavmi.assigned.network/git/tl

    #tlang #compilers

  7. I found a lexer for Macros in . Now I can use Copilot for coding ijm.

  8. Me: keyword

    Lexer: looks good to me

    Me: keyword

    Lexer: yep no problem

    Me: so what is this keyword

    Lexer: identifier

    #parsers #lexers
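    The joke reflects a common lexer design: scan an identifier-shaped lexeme first, then consult a keyword table, with anything unknown falling through to identifier. A minimal sketch (keyword set invented for illustration):

```python
KEYWORDS = {"if", "while", "return"}

def classify(word):
    # Scan-then-lookup: a lexeme shaped like an identifier is a KEYWORD
    # only if the keyword table says so; otherwise it's an IDENT.
    return ("KEYWORD", word) if word in KEYWORDS else ("IDENT", word)

print(classify("while"))    # ('KEYWORD', 'while')
print(classify("keyword"))  # ('IDENT', 'keyword')  <- the punchline
```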

  9. Customer data platform Lexer raises $25.5M Series B for global expansion - Left to right: Lexer founders Dave Whittle, Aaron Wallis, Chris Brewer
    The massive shift to online s... - feedproxy.google.com/~r/Techcr #customerdataplatform #fundings&exits #australia #ecommerce #startups #lexer #tc

  10. re2c is a fast lexer generator.

    re2c is a generator for C or C++ programs that tokenize an input string and execute code based on the tokens found. re2c uses regexes (with optional submatches) for pattern matching, which is performed using a highly efficient goto-based state machine. re2c is very configurable and can accept many different input encodings.

    Website 🔗️: re2c.org/

    apt 📦️: re2c

  11. My brain wants to make this lexer instance OO for encapsulation reasons

    but Ada says "Umm, that's not your type, you ain't taking a pointer to it"

  12. SBPL Toolchain v1.0.0 — Lexer, parser, and VS Code extension for Apple's Sandbox Profile Language (.sb files)

    github.com/g-cqd/sbpl-toolchain

    #Swift #macOS #iOS #VSCode #OpenSource #AppleDev

  13. Does anyone here know #pygments or #chroma lexer stuff well enough to tell me how (if at all?) one can mark up stuff in *multiple ways at once*?

    For instance the following line:

    # foo **bar**

    The entire line should be a "Heading", the first two characters and the stars should be "Keyword"s, and the bar should be "GenericStrong".

    I can't for the life of me figure out how that's possible to achieve.

    CC @avghelper

  15. Damn. My PLY lexer doesn't recognize hexadecimal numbers, it recognizes them as decimal(0) and symbol(x1).

    edit: oh, it's a matter of definition order.

    #parser #lexer
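    The definition-order pitfall generalizes beyond PLY (where token rules defined as functions are tried in the order they appear in the file). A sketch with plain `re`, token names invented, showing why the hex pattern must come before the decimal one:

```python
import re

def lex(pattern, src):
    """Return (kind, lexeme) pairs, dropping whitespace."""
    return [(m.lastgroup, m.group())
            for m in re.finditer(pattern, src) if m.lastgroup != "WS"]

# DEC listed first: "0x1F" lexes as decimal 0 followed by symbol x1F.
wrong = r"(?P<DEC>\d+)|(?P<HEX>0[xX][0-9a-fA-F]+)|(?P<SYM>[A-Za-z_]\w*)|(?P<WS>\s+)"
# HEX listed first: the whole literal is matched as one hex token.
right = r"(?P<HEX>0[xX][0-9a-fA-F]+)|(?P<DEC>\d+)|(?P<SYM>[A-Za-z_]\w*)|(?P<WS>\s+)"

print(lex(wrong, "0x1F"))  # [('DEC', '0'), ('SYM', 'x1F')]
print(lex(right, "0x1F"))  # [('HEX', '0x1F')]
```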

  16. Overdue post on new features implemented in the #trbot #rust lexer.

    First up is the generalization of the concept of grouped inputs. Anything in a grouping is in between [] and can be linked with any other instruction type to execute everything in parallel.

  17. Programming languages are absolutely mortified of strings, aren’t they. All those arbitrary words with their alien systems of meaning. They have nightmares about stumbling on an unterminated “ AND THEN EVERYTHI #programming #lexer #syntax #electricsheep

  18. 🚀💨 This 8,277-word masterpiece on #turbocharging #lexers is a snooze-fest of #jargon that only a #computer #science major could love. 😴 With terms like "purple gardens" and "zero alloc string windows," it sounds more like avant-garde #poetry than #programming advice. If you're looking for a quick nap, this might be the 'lexer' you're searching for. 💤
    xnacly.me/posts/2025/fast-lexe #HackerNews #ngated

  19. How to build your own parser and AST generator in C++ with minimal effort: meet QapDSLv2

    QapDSLv2: a new standard for AST-heavy parsing. QapDSLv2 delivers: lightning-fast AST construction, full preservation of the source code structure, and easy interpretation and modification of grammars. Forget every other parser! With QapDSLv2 you can build compilers/analyzers/code formatters in minutes/hours. // an almost barefaced lie

    Parsers and abstract syntax tree (AST) generation usually take a long time, are complicated, and require a ton of boilerplate code. But what if I told you that you can now describe grammars and data structures at the same time and get ready-made, optimized C++ code automatically? QapDSLv2 is a new standard of efficiency and convenience in parsing. It is a parser-description language that removes syntactic noise, simplifies integration with C++, and lets you build complex analyzers without pain or errors. Forget endless debugging cycles and inscrutable generators: now everything is simple, clear, and efficient.

    In this article you will learn how QapDSL v2 changes the rules of the game in the world of parsing and compilers, see real examples, and understand why this matters for anyone who works with programming languages and text processing. Ready to speed up development and take your projects to the next level? QapGen is a powerful parser generator built on QapDSLv2 that turns QapDSLv2 grammars directly into a high-performance C++ parser with a typed AST described right in the grammar.

    t_sep { string body = any (" \t\r\n"); } using " " as t_sep; t_value{ TAutoPtr<i_value> body; " "? } t_comma_value{ "," t_value body; " "? } t_array=>i_value{ "[" " "? t_value first?; vector<t_comma_value> arr?; "]" " "? }

    habr.com/ru/articles/922128/

    #QapDSL #Lexers #AST #Compilers #Parser #parsergenerator #Parsers #C++ #dsl

  20. On 19 February 1993, ARMY OF DARKNESS (L'EXÈRCIT DE LES TENEBRES, 1992)
    by Sam Raimi had its US premiere.

    There had already been several screenings, and even a premiere or two in Asia, in 1992.

    #19febrer
    #LExèrcitDeLesTenebres
    #ArmyOfDarkness

  21. SemVer grammar for Plex

    Supports referencing other grammar rules by name using the (?&NAME) syntax, where NAME is the name of the grammar rule.

    github.com/ghostwriter/plex

    packagist.org/packages/ghostwr

    #php #lexer

  22. Long time no post. The #rust TRBot lexer is feature complete now, and I ran a few tests with the #type2play community to see how well it works. There was a desire from players to use the old syntax, so I ported the legacy parser from #csharp to Rust and implemented a way to convert new inputs to old on the fly. This will help gradually ease players into using the new syntax.

    Yesterday, I added the first new feature - linking repeated inputs with other inputs.

    #trbot #freesoftware #foss

  23. Writing my first bottom up parser. I want my xml lexer to give the doctype as one token but to do that I need to parse the internal subset -> markup decl -> element decl -> content spec -> children which has

    [47] children ::= (choice | seq) ('?' | '*' | '+')?
    [48] cp ::= (Name | choice | seq) ('?' | '*' | '+')?
    [49] choice ::= '(' S? cp ( S? '|' S? cp )+ S? ')'
    [50] seq ::= '(' S? cp ( S? ',' S? cp )* S? ')'

    As its grammar. Notice the recursion. I would normally use a recursive descent parser, but since I'm using Rust's coroutines I can't have recursive coroutines (as far as I'm aware).

    I'm using coroutines because this is a streaming parser meant for embedded systems with very little memory. At any point I could run out of input which is when I yield back up to get more. My previous iteration of this was a massive state machine essentially implementing coroutines from scratch.

    #rust #embedded #coroutines #xml
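    One common workaround for recursion in a streaming parser, sketched here with Python generators standing in for coroutines (all names invented; this only shows the nesting bookkeeping, not the choice/seq distinction): track depth with an explicit counter or stack, and yield whenever input runs out.

```python
def content_spec():
    """Consume one nested group like '(a|(b,c))*' from a chunked stream.
    Yields None whenever it runs out of input; the driver sends more.
    An explicit depth counter replaces grammar recursion."""
    buf, pos, depth, out = "", 0, 0, []
    while True:
        if pos >= len(buf):
            buf += yield None          # suspend until more input arrives
            continue
        ch = buf[pos]
        pos += 1
        out.append(ch)
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                # Group closed; needs one char of lookahead for ?, * or +.
                if pos >= len(buf):
                    buf += yield None
                if pos < len(buf) and buf[pos] in "?*+":
                    out.append(buf[pos])
                return "".join(out)

def drive(chunks):
    """Feed chunks into the generator until it returns a result."""
    gen = content_spec()
    try:
        gen.send(None)                 # prime: run to the first yield
        for chunk in chunks:
            gen.send(chunk)
    except StopIteration as done:
        return done.value
    raise EOFError("stream ended mid-spec")

print(drive(["(a|(b,", "c))", "*"]))   # (a|(b,c))*
```

    The generator never recurses: however deeply the groups nest, the only state is the depth counter and the accumulated output, which keeps memory flat for embedded use.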
