#bson — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #bson, aggregated by home.social.
-
#ITByte: A well-designed data format is dictated by what makes the information easiest for the intended audience to understand while exchanging data between two systems.
#Data formats for the #Web - a short overview and comparison of #XML, #JSON, #BSON, #YAML and more...
https://knowledgezone.co.in/posts/Data-format-for-the-Web-6131d5b7c84fbdec9b873510
-
Design like a senior: universal binarization
Hello, my name is Dmitry Karlovsky and I... well, it doesn't matter who I am. What matters is what I'm talking about and how I argue it. Those who know me need no introduction, and those who don't have a great opportunity to approach the question with a fresh mind. That is extremely important if we want to design something truly well, rather than the usual way. So what is this VaryPack?
-
JSON? JSONB? BSON? CBOR? MsgPack? Ah, VaryPack!
VaryPack is a new, simple, flexible, fast, and compact format for binary serialization of arbitrary data. What's this trendy thing about?
-
CW: Stop using JSON everywhere. [Long post]
I remember an article about a company that was constantly hitting AWS quotas: their JSON payload, which fit within the limits on its own, was put into a string field inside another JSON object used for communication between servers, so all quotes and backslashes were double-escaped as \" and \\, increasing the payload size.
String escaping is also the reason why, for example, serde (the Rust de-/serialization library) can give you Cow<str> if you don't want extra allocations: when there are no character escapes, a reference to the original input can be passed along, but when we're doing \" -> " replacements, we need to copy that part of the input anyway.
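The escaping bloat described above is easy to reproduce with the standard library (a sketch; the payload contents are made up for illustration):

```python
import json

# An inner payload that is already JSON text...
inner = json.dumps({"msg": 'he said "hi"', "path": "C:\\tmp"})
# ...stuffed into a string field of an outer JSON object,
# so every quote and backslash gets escaped again.
outer = json.dumps({"payload": inner})
# And once more, as in server-to-server relaying:
outer2 = json.dumps({"payload": outer})

# The size grows at each layer, purely from escaping.
print(len(inner), len(outer), len(outer2))
```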
I'm not saying this is bad; that is how text formats work in general, not only JSON. But if you need to put arbitrary data inside objects, think twice: a binary format like MessagePack, BSON, or even a custom ProtoBuf schema will probably be much more efficient for your task.
Text formats are also basically unsuitable for streaming, while loading a big object into RAM is a very bad idea. If it's an array, you can separate objects by newline instead of using JSON's [ ]. In other cases, look for a SAX-like library (or search for something like "stream json") for your programming language.
I won't give a specific example, but I'm sure there are developers doing this: encoding a file with base64 to send it inside a JSON request. Please remember that base64 bloats the payload by approximately 1.33x [^1], so you should either send the file in a separate HTTP request or use the multipart form data type. Or encode your objects with a binary format. The last two options are OK when you're working with small files and insist on doing everything in one request; otherwise, upload the data in separate requests in parallel.
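The newline-separated alternative to a top-level JSON array can look like this (a sketch using only the standard library; io.StringIO stands in for a file or network stream):

```python
import io
import json

# Newline-delimited JSON: one object per line, no enclosing [ ].
ndjson = '{"id": 1}\n{"id": 2}\n{"id": 3}\n'

# Each line can be parsed and handled as it arrives,
# without ever holding the whole array in memory.
for line in io.StringIO(ndjson):
    obj = json.loads(line)
    print(obj["id"])
```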
[^1]: the formula for base64 string length is 4 * ceil(original_length / 3).
Another example of how definitely NOT to do it is Piped (a privacy frontend for YouTube). On some API endpoints it provides a nextpage object containing session info used to request the next page of a channel, a playlist, search results, or comments, and the problem is that it's a JSON object put inside a string, as explained above:
"nextpage":"{\"url\":\"https…
Even funnier, there is a body field inside this nextpage object that contains another JSON object, encoded in base64, so there are three layers of text-format encoding.
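The footnote's length formula (and the ~1.33x bloat) can be sanity-checked with the standard library:

```python
import base64
import math

for n in (1, 2, 3, 100, 1000):
    data = b"x" * n
    encoded = base64.b64encode(data)
    # Standard base64 with padding: 4 * ceil(n / 3) output bytes.
    assert len(encoded) == 4 * math.ceil(n / 3)

# The asymptotic overhead is 4/3, i.e. roughly 1.33x:
print(len(base64.b64encode(b"x" * 3000)) / 3000)  # 1.3333...
```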
And when a client requests the next page, the object is sent in GET query-string parameters, so it gets urlencoded (percent-encoded), resulting in four layers! I don't know why browsers don't reject its long, ugly URLs.
Everything before the query string would be excusable if the internal YT API itself required such a format for a context/session object. Invidious doesn't care about context at all and sends a clean request, if I understood correctly.
And the most ill-conceived JSON use case, I think, is JWT. It base64-encodes an already-plaintext format (base64 is meant for converting binary data to ASCII text; the same sin as in Piped, but we forgave that one), moreover it does this to two objects, and stores a token with all that overhead in cookies.
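The JWT overhead complained about above can be illustrated by hand-assembling an unsigned token (a sketch; a real JWT appends a signature segment on top, which only adds to the size):

```python
import base64
import json

def b64url(data: bytes) -> bytes:
    # JWT uses unpadded base64url encoding.
    return base64.urlsafe_b64encode(data).rstrip(b"=")

header = {"alg": "HS256", "typ": "JWT"}
payload = {"sub": "1234567890", "name": "John Doe"}

# Two JSON objects, each base64url-encoded, joined with dots
# (plus a third, signature segment in a real token).
token = b".".join(b64url(json.dumps(p, separators=(",", ":")).encode())
                  for p in (header, payload))

plain = len(json.dumps(header)) + len(json.dumps(payload))
# Already-readable JSON, bloated by base64 before it even hits a cookie.
print(len(token), plain)
```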
By the way, want a JSON config in your software? Take a look at Hjson, which is much more convenient to write by hand.
-
In #python, is #setuptools special and assumed to always be installed even though it isn't part of the standard library? (sort of a phantom stdlib)
E.g., is this setup.py file from #bson wrong, in that it imports setuptools but doesn't list it as a dependency?
https://github.com/py-bson/bson/blob/master/setup.py
Anyhow, the install blew up on my build.
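For context: setuptools is not part of the standard library, so a package that imports it in setup.py relies on it being present in the build environment. One way to make that dependency explicit is to declare it in pyproject.toml (a minimal sketch of the modern PEP 518 approach, not taken from the linked repo):

```toml
# pyproject.toml — declares setuptools as a build-time dependency,
# so installers set up an isolated build environment containing it.
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
```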
-
Do you know RFC 8949? No? Until this morning, to my great shame, neither did I... Yet the subject matters: a binary alternative that is compact, fast, standardized, and durable. #CBOR: Concise Binary Object Representation.
https://cbor.io/
The site's only tutorial points to an introductory article in French by @bortzmeyer: https://www.bortzmeyer.org/7049.html
#BSON, #protobuf, #MessagePack: each has its advantages (and drawbacks) compared with #JSON.
CBOR is one color in that palette.
-
AFAIK there is no way to export a simple config of the #Ubiquiti #UniFi network #config.
I've worked around it by:
1. Downloading a backup
2. Using a third party decrypt script to convert backup to zip
3. Extracting files from zip
4. Using #Mongo Tools to convert #BSON to #JSON
5. Parsing in VS Code
But it is freaking painful, especially the parts where JSON has been squished into string fields.
🤬! 🤬! 🤬!
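The last pain point, JSON squished into string fields, can be unpicked with a small helper once the dump is plain JSON (a sketch; the record and field names below are invented, not actual UniFi schema):

```python
import json

def expand_embedded_json(value):
    """Recursively parse string fields that themselves contain JSON.

    Useful after converting a BSON dump to JSON, when some string
    fields hold serialized JSON objects ("squished" JSON).
    """
    if isinstance(value, dict):
        return {k: expand_embedded_json(v) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_embedded_json(v) for v in value]
    if isinstance(value, str):
        stripped = value.strip()
        if stripped.startswith(("{", "[")):
            try:
                return expand_embedded_json(json.loads(stripped))
            except json.JSONDecodeError:
                pass  # not actually JSON — keep the original string
    return value

# A record with JSON hidden inside a string field:
record = {"name": "lan", "config": "{\"vlan\": 10, \"dhcp\": true}"}
print(expand_embedded_json(record))
# → {'name': 'lan', 'config': {'vlan': 10, 'dhcp': True}}
```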
-
Binary serialization...
-
#jsoncons is a #singleheader #Cpp #library for parsing #JSON-like formats.
jsoncons has a data model that allows for parsing different formats that resemble JSON, like #CBOR and #BSON, using extensions. jsoncons provides several ways of interacting with parsed data: a query-able structure, a strongly typed C++ class, or a SAX-like parse stream. jsoncons is fast, and has extensions for things like JSONPath.
Website 🔗️: https://danielaparker.github.io/jsoncons/
-
@alva @dkl Even for embedded things... #BSON, #CBOR, #ProtocolBuffers, #UBJSON and "Smile" are existing things.
Yes, you can write your own (or just dump whatever your environment thinks the current binary representation should look like to disk), but then you have no tooling, no portability, and no validation/standardisation.