I’ve hit a most frustrating problem. I am creating a new Venturii Buscom plugin using libmodbus, but before I got very far I was stumped by some strange behavior. Here is the code:
#include <stdio.h>
#include <errno.h>
#include <modbus.h>

int main(void) {
    modbus_t *mb;
    uint16_t tab_reg[32];

    mb = modbus_new_tcp("192.168.1.100", 502);
    if (mb == NULL) {
        fprintf(stderr, "Unable to create the libmodbus context\n");
        return 1;
    }
    modbus_set_debug(mb, 1);
    modbus_set_error_recovery(mb,
        MODBUS_ERROR_RECOVERY_LINK |
        MODBUS_ERROR_RECOVERY_PROTOCOL);
    if (modbus_connect(mb) == -1) {
        fprintf(stderr, "Connection failed: %s\n", modbus_strerror(errno));
        modbus_free(mb);
        return 1;
    }

    /* Read 16 registers starting at address 0 */
    int n = modbus_read_registers(mb, 0, 16, tab_reg);
    printf("Result: [%d]\n", n);
    if (n == -1) {
        printf("Error: [%s]\n", modbus_strerror(errno));
    }

    /* Print whatever was read; the loop is skipped when n is -1 */
    int i;
    printf("Register Data: ");
    for (i = 0; i < n; i++) {
        printf("[%02x] ", tab_reg[i]);
    }
    printf("\n");

    modbus_close(mb);
    modbus_free(mb);
    return 0;
}
I downloaded the latest source for libmodbus and installed it into /usr/local. Therefore, I am compiling with:
gcc mytest.c -I /usr/local/include/modbus/ -L /usr/local/lib/ -lmodbus
The output:
Connecting to 192.168.1.100:502
[00][01][00][00][00][06][FF][03][00][00][00][10]
Waiting for a confirmation...
<00><01><00><00><00><23><FF><03><10><00><00><00><00><00><00><00><0F><00><00><00><00><00><00><00><0A>
Message length not corresponding to the computed length (25 != 41)
Bytes flushed (16)
Result: [-1]
Error: [Invalid data]
Register Data:
It does not make any difference if I request fewer registers; the result is always a complaint from libmodbus that not enough bytes were received from the slave. What is strange to me, though, is that the remaining bytes are there! In Wireshark, I can see that the entire query was sent from my test program to the Modbus simulator in a single packet.
And the response from the simulator is delivered in a single packet – containing the correct data. (The bytes line up with the actual bytes in the simulator’s register).
Perplexing me further is the fact that, due to the error handling parameters I’ve set on my modbus context, you can see that after claiming there is not enough data, libmodbus then flushes the remainder of the buffer – which mysteriously contains the exact number of missing bytes.
In order to understand why this is not working, the next step, I thought, would be to analyze the method by which libmodbus computes the actual number of bytes it receives. We are going to make several assumptions off the bat: First, I am assuming that it has already correctly identified the number of bytes that it is expecting. Second, I’m going to assume that the operating system is not doing something funny “under the hood” with the incoming data. Since we can see the entire response in a single packet, I’m assuming that the entire packet was available at the sockfd for reading.
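Looking at the definition of the Modbus TCP frame format (from Wikipedia), each frame starts with a 2-byte transaction identifier, a 2-byte protocol identifier, a 2-byte length field, and a 1-byte unit identifier, followed by the function code and the data.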
Again, here is our received data:
<00><01><00><00><00><23><FF><03><10><00><00><00><00><00><00><00><0F><00><00><00><00><00><00><00><0A>
The first two bytes, 00 01, are our TID or Transaction Identifier. Check.
The next two bytes are both 00, the correct values for Modbus/TCP. Check.
Length field: 00 23 is the number of bytes remaining in this frame. So far we’ve read 6 bytes into the frame, and 0x23 is 35 in decimal. 35 + 6 = 41. So far, so good. 0xFF is our slave address. Next we have our function code, 0x03 (Read Holding Registers), and then our data begins with 0x10, which is 16 in decimal and which I took to mean that 16 words are coming.
All of that appears correct to me. So why are libmodbus 3.0.8 and 3.1.6 both complaining that not enough bytes are in the message, and then promptly flushing all the missing bytes? Why can I find nothing online about this error message? Is no one else seeing results like this? libmodbus is part of every major Linux distribution, so either no one is using it anymore, or it probably works just fine, leaving the problem back in my court. Could this be some OS-level thing? Buffering, perhaps, of some kind?
Pondering that last query, I decided to try my test code on a new VM. Actually, I pulled a new Fedora 31 container, updated everything on it and ran the same test code above. Would you like to take a guess at the results?
[root@0f574053a251 /]# gcc mytest.c -I /usr/include/modbus/ -lmodbus
[root@0f574053a251 /]# ./a.out
Connecting to 192.168.1.100
[00][01][00][00][00][06][FF][03][00][00][00][10]
Waiting for a confirmation...
<00><01><00><00><00><23><FF><03><10><00><00><00><00><00><00><00><0F><00><00><00><00><00><00><00><0A>
Message length not corresponding to the computed length (25 != 41)
16 bytes flushed
Result: [-1]
Error: [Invalid data]
Register Data:
[root@0f574053a251 /]#
It’s worth noting that even Fedora 31 still carries libmodbus 3.0.8, probably because it was the most stable release. Nonetheless, the exact same result! Now what? I thought I might run some of the built-in tests that come with the source version.
TEST INVALID INITIALIZATION:
The device string is empty
OK
The baud rate value must not be zero
OK
The service string is empty
OK
ALL TESTS PASS WITH SUCCESS.
OK, so the library thinks it is sane. I wonder what its data looked like when it tried to use the 0x03 function code. (There was a lot of data spit out by both the client and server test programs.) Scrolling back, I found this:
[00][00][00][00][00][06][FF][03][01][60][00][00]
* try function 0x3: read 0 values: Waiting for a confirmation...
<00><00><00><00><00><03><FF><83><03>
OK
[00][00][00][00][00][06][FF][03][01][60][00][7E]
* try function 0x3: read 126 values: Waiting for a confirmation...
<00><00><00><00><00><03><FF><83><03>
OK
It doesn’t matter how many registers I request; the response always appears to be short. In this next example, I attempt to read only one register:
[00][01][00][00][00][06][00][03][00][00][00][01]
Waiting for a confirmation...
<00><01><00><00><00><05><00><03><00>
Message length not corresponding to the computed length (9 != 11)
Bytes flushed (2)
If I call modbus_read_bits() instead, and request a single bit:
[00][01][00][00][00][06][00][01][00][00][00][01]
Waiting for a confirmation...
<00><01><00><00><00><04><00><01><00>
Message length not corresponding to the computed length (9 != 10)
Bytes flushed (1)
I am still one byte short, and then it merrily flushes one byte.
I got my first break when I tried to request 4 bits using modbus_read_bits() (function code 0x01):
[00][01][00][00][00][06][FF][01][00][01][00][04]
Waiting for a confirmation...
<00><01><00><00><00><04><FF><01><01><02>
Result: [4]
Register Data: [00] [01] [00] [00]
Wait, what?! That worked??? I tried other quantities:
[00][01][00][00][00][06][FF][01][00][01][00][08]
Waiting for a confirmation...
<00><01><00><00><00><04><FF><01><01><02>
Result: [8]
Register Data: [00] [01] [00] [00] [00] [00] [00] [00]
(Requesting 128 bits starting at address 1)
[00][01][00][00][00][06][FF][01][00][01][00][80]
Waiting for a confirmation...
<00><01><00><00><00><13><FF><01><08><04><04><08><24><00><00><00><00>
Message length not corresponding to the computed length (17 != 25)
Bytes flushed (8)
(Requesting 9 bits starting at address 1)
[00][01][00][00][00][06][FF][01][00][01][00][09]
Waiting for a confirmation...
<00><01><00><00><00><05><FF><01><01><02>
Message length not corresponding to the computed length (10 != 11)
Bytes flushed (1)
(Requested 8 bits again from address 1 to make sure I wasn't going crazy)
[00][01][00][00][00][06][FF][01][00][01][00][08]
Waiting for a confirmation...
<00><01><00><00><00><04><FF><01><01><02>
Result: [8]
Register Data: [00] [01] [00] [00] [00] [00] [00] [00]
(Requesting 42 registers with modbus_read_registers(); request and response bytes not shown)
Message length not corresponding to the computed length (51 != 93)
Bytes flushed (42)
Now I am more confused than ever. Some requested quantities return data properly, while others fail. In further testing I started to notice a pattern: any time I requested an even number of registers, libmodbus would receive the correct number of bytes MINUS the number of registers I’d requested, every time. If I requested 100 registers, I’d get “Message length not corresponding to the computed length (109 != 209) … Bytes flushed (100)”. If I requested 42 registers, it would reply, “Message length not corresponding to the computed length (51 != 93) … Bytes flushed (42)”. Any odd number of requested registers came back short by that number plus one, such as this request for 43 registers: “Message length not corresponding to the computed length (51 != 95) … Bytes flushed (44)”.
I started to wonder if there was some miscalculation in the library, and with nothing else I could think to try, it was at this point that I decided to dive into the source code for libmodbus. My first step was to add some debugging output at various points along the way as it grabs the response byte by byte. Here you can see the entire output of my test run, requesting 16 registers starting at address 1:
[00][01][00][00][00][06][FF][03][00][01][00][10]
Waiting for a confirmation...
_modbus_receive_msg(DEBUG): Length to read: [8]
_modbus_receive_msg(DEBUG): recv(msg_length: [0] length_to_read: [8]) retval: [8]
<00><01><00><00><00><23><FF><03>_modbus_receive_msg(DEBUG): Length to read: [1]
_modbus_receive_msg(DEBUG): recv(msg_length: [8] length_to_read: [1]) retval: [1]
<10>_modbus_receive_msg(DEBUG): Length to read: [16]
_modbus_receive_msg(DEBUG): recv(msg_length: [9] length_to_read: [16]) retval: [16]
<00><00><00><00><00><0F><00><00><00><00><00><00><00><0A><00><00>
check_confirmation(DEBUG): rsp_length_computed: [41]
Message length not corresponding to the computed length (25 != 41)
Bytes flushed (16)
It reads the first 8 bytes to figure out what this message is about. It then asks for one byte to determine how much data follows, and we get <10>, which of course is 16 in decimal. But look at the next line: it asks for 16 more bytes, not the 32 that 16 registers require. Keep in mind that these holding registers are 16 bits wide each. Therefore, either the library has interpreted 16 to mean sixteen 8-bit bytes (which I suspected was incorrect), or the Modbus simulator I am using is reporting words when it should be reporting bytes, “confusing” the library into reading only 16 more bytes when it has already computed that there should be 32. Instead of flipping a coin, I decided to turn to the MODBUS specification to find out what this value *should* be, and whether there are any circumstances in which either interpretation could be correct.
There you have it: libmodbus is correct. In the specification’s description of the Read Holding Registers (0x03) response, this field is a byte count equal to twice the number of registers, so it represents the number of 8-bit bytes that follow in the response. Therefore, it appears that my simulator is to blame here. Looking now at MOD_RSSIM, the MODBUS simulator I was using, I had version 6.7. It turns out there have been a lot of newer versions since then, and the tool is still actively developed. I will download and try the new version tomorrow and see what my results are then.
It is now the morning, and with MOD_RSSIM version 8.20 in place of 6.7, I re-ran all of my tests and guess what? Everything works! What a rabbit hole to have fallen into, but such is the nature of software development, problem solving, and discovery! Now to finish the Venturii Buscom Modbus plugin and move on to bigger and brighter things!