Discussion:
CAN bus transmission timeout
Pavel Kirienko
2013-10-28 13:59:02 UTC
Hi list,

Consider the following scenario:

There is a Linux-powered device connected to a CAN bus. The device
periodically transmits the CAN message. The nature of the data carried
by this message is like measurement rather than command, i.e. only the
most recent one is actually valid, and if some messages are lost that
is not an issue as long as the latest one was received successfully.

Then the device in question is being disconnected from the CAN bus for
some amount of time that is much longer than the interval between
subsequent message transmissions. The device logic is still trying to
transmit the messages, but since the bus is disconnected the CAN
controller is unable to transmit any of them so the messages are being
accumulated in the TX queue.

Some time later the CAN bus connection is restored, and all the
accumulated messages are being kicked on the bus one by one.

--- Problem ---

1. When the CAN bus connection is restored, an undefined number of
outdated messages will be transmitted from the TX queue.

2. While the CAN bus connection is still unavailable but the TX queue
is already full, the most recent messages (i.e. the only valid ones)
will be discarded instead of being transmitted.

3. Once the CAN bus connection is restored, there will be a short-term
traffic burst while the TX queue is being flushed. This can disturb the
Time Triggered Bus Scheduling if one is used (it is in my case).

--- Question ---

My application uses the SocketCAN driver, so the question applies
primarily to SocketCAN, but other options are welcome too if there
are any.

I see two possible solutions: define a message transmission timeout
(if a message was not transmitted within some predefined amount of
time, it will be discarded automatically), or abort transmission of
outdated messages manually (though I doubt it is possible at all with
the socket API).

Since the first option seems the most realistic to me, the questions are:

1. How does one define a TX timeout for a CAN interface under Linux?

2. Are there other options to solve the problems described
above, aside from TX timeouts?

Thanks in advance,
Pavel Kirienko.

P.S. reposted from http://stackoverflow.com/questions/19633015
--
To unsubscribe from this list: send the line "unsubscribe linux-can" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Kurt Van Dijck
2013-10-28 14:35:58 UTC
Hi Pavel,

I believe the queue filling can be solved by reading your own
messages back. As long as you see your own messages, your queue
remains empty.
When your message has not yet been read back, you must assume it did
not get out yet, so it's still in the queue. No need to resend, because
the previous one is still pending.

When you receive a bus-off indication, you should reset this mechanism.

This addresses problems 1 & 3.
Problem 2 (old data) remains for 1 transmission period.

Receiving your own traffic can be enabled with:

int recv_own_msgs = 1; /* 0 = disabled (default), 1 = enabled */

setsockopt(sock, SOL_CAN_RAW, CAN_RAW_RECV_OWN_MSGS,
&recv_own_msgs, sizeof(recv_own_msgs));

Kind regards,
Kurt
--
Kurt Van Dijck
GRAMMER EiA ELECTRONICS
http://www.eia.be
***@eia.be
+32-38708534
Pavel Kirienko
2013-10-28 16:01:18 UTC
Hi Kurt,

Thanks.
I assume the most obvious solution - setsockopt SO_SNDTIMEO - is not
available with CAN, right? (well, I understand that this feature must
also be supported by CAN device drivers)

According to my understanding, your approach basically boils down to
maintaining a TX queue inside the user space application. Each message
from this user space TX queue shall be moved into the socket only when
it is confirmed that the previous message has been pushed to the bus
successfully.
Compared to the straightforward approach, this one has one extra IO
per message (to read back my own messages). This shouldn't hurt the
performance a lot I guess, but it increases the inter-message
intervals. OK, we can cope with that.

New questions:

1. Assuming that there might be more than one node transmitting some
particular message, how do I distinguish which messages were actually
received from the bus, and which were looped back? I guess I should
open two sockets, one for loopbacks+RX, the other just for RX?

2. If an application closes its socket while there are still some
messages to transmit from the TX queue, will these untransmitted
messages be discarded? Or will they wait regardless of the socket's
life cycle? I am asking because it would be nice if there is some way
to drop the transmitting queue entirely in case of bus-off condition.

Best regards,
Pavel.
Kurt Van Dijck
2013-10-29 06:03:19 UTC
Post by Pavel Kirienko
1. Assuming that there might be more than one node transmitting some
particular message, how do I distinguish which messages were actually
received from the bus, and which were loopbacked? I guess I should
open two sockets, one for loopbacks+RX, other just for RX?
You just need 1 socket.

See $ git show 1e55659ce6ddb5247cee0b1f720d77a799902b85

In the msg_flags returned by recvmsg(),
MSG_DONTROUTE is set for any packet from localhost,
MSG_CONFIRM is set for any packet of your socket.
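
A minimal sketch of how those two flags might be checked on a single CAN_RAW socket follows. It assumes the socket is already bound and has CAN_RAW_RECV_OWN_MSGS enabled; the names frame_origin, classify_origin and read_frame_with_origin are made up for this illustration.

```c
#include <assert.h>
#include <linux/can.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical classification of a received frame. */
enum frame_origin {
    ORIGIN_REMOTE,       /* received from another node on the bus    */
    ORIGIN_LOCAL_OTHER,  /* sent by a different socket on this host  */
    ORIGIN_OWN_SOCKET    /* echo of a frame written to this socket   */
};

/* Map the msg_flags filled in by recvmsg() to an origin. */
enum frame_origin classify_origin(int msg_flags)
{
    if (msg_flags & MSG_CONFIRM)
        return ORIGIN_OWN_SOCKET;
    if (msg_flags & MSG_DONTROUTE)
        return ORIGIN_LOCAL_OTHER;
    return ORIGIN_REMOTE;
}

/* Read one frame and report where it came from. Returns the recvmsg()
 * result; *origin is written only when a full frame was read. */
ssize_t read_frame_with_origin(int sock, struct can_frame *frame,
                               enum frame_origin *origin)
{
    struct iovec iov;
    struct msghdr msg;
    ssize_t n;

    assert(frame != NULL && origin != NULL);

    iov.iov_base = frame;
    iov.iov_len = sizeof(*frame);
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    n = recvmsg(sock, &msg, 0);
    if (n == (ssize_t)sizeof(*frame))
        *origin = classify_origin(msg.msg_flags);
    return n;
}
```

A frame classified as ORIGIN_OWN_SOCKET can then be treated as the transmission confirmation for the read-back scheme, while ORIGIN_REMOTE frames with the same CAN ID are ordinary bus traffic.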
Post by Pavel Kirienko
2. If an application closes its socket while there are still some
messages to transmit from the TX queue, will these untransmitted
messages be discarded? Or will they wait regardless of the socket's
life cycle? I am asking because it would be nice if there is some way
to drop the transmitting queue entirely in case of bus-off condition.
The messages that were sent with sendto/send/...
have been queued, and cannot be recalled :-(
Pavel Pisa
2013-10-29 00:36:26 UTC
Hello Pavel,
Post by Pavel Kirienko
There is a Linux-powered device connected to a CAN bus. The device
periodically transmits the CAN message. The nature of the data carried
by this message is like measurement rather than command, i.e. only the
most recent one is actually valid, and if some messages are lost that
is not an issue as long as the latest one was received successfully.

you can achieve the required behavior (keep only the most recent message
in the queue) by use of Linux socket implementation QoS mechanism.

Look at paragraph "3.1.3. pfifo_head_drop" in our report

SocketCAN and queueing disciplines
http://rtime.felk.cvut.cz/can/socketcan-qdisc-final.pdf

You probably do not want to drop all sent CAN messages
except the last one when different IDs are sent.
The canid ematch classifier is your friend in this case, see

3.2.3. The canid ematch

If you have a reasonably recent Linux kernel (v3.6+) and the
corresponding iproute2 package (v3.6.0+) then all required
support is included.

You need something like the prio_0 setup used for our timing report.

I am at a conference now so I do not have access to an exact
setup example for that case. So I am sending a copy
to Rostislav Lisovy, who was the main developer of the ematch canid
filter/classifier.

But generally something like the next example (for sure not
fully correct, you need to somehow specify pfifo_fast
for other traffic)

tc filter add dev can0 parent 1:0 basic \
match canid\(sff 0x123 sff 0x500:0x700 eff 0x00:0xff\) \
flowid 1:1

tc qdisc add dev can0 parent 1:1 handle 10 pfifo_head_drop limit 1
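
To make the effect of "pfifo_head_drop limit 1" concrete, here is a small user-space model of a head-drop FIFO. It is not the kernel implementation, only a sketch of the queueing behaviour; hd_queue, hd_enqueue, hd_dequeue and HD_LIMIT are hypothetical names, and the payload is reduced to a single integer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HD_LIMIT 1  /* mirrors "limit 1" in the tc line above */

/* Toy ring buffer that drops from the head (oldest entry) when full. */
struct hd_queue {
    uint32_t slot[HD_LIMIT];
    size_t head;    /* index of the oldest entry */
    size_t count;   /* number of queued entries  */
};

/* Enqueue v. If the queue is full the oldest entry is discarded first.
 * Returns the number of entries dropped (0 or 1). */
int hd_enqueue(struct hd_queue *q, uint32_t v)
{
    int dropped = 0;

    assert(q != NULL);
    if (q->count == HD_LIMIT) {
        q->head = (q->head + 1) % HD_LIMIT;  /* discard the oldest */
        q->count--;
        dropped = 1;
    }
    q->slot[(q->head + q->count) % HD_LIMIT] = v;
    q->count++;
    return dropped;
}

/* Dequeue the oldest entry into *out. Returns 1 on success, 0 if empty. */
int hd_dequeue(struct hd_queue *q, uint32_t *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return 0;
    *out = q->slot[q->head];
    q->head = (q->head + 1) % HD_LIMIT;
    q->count--;
    return 1;
}
```

With a limit of 1, every new sample replaces the one still waiting, so when the bus comes back only the most recent sample is left to transmit.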

If you have an older kernel and cannot upgrade then you can try
to figure out how to use the u32 ematch classifier. You need to
switch the CAN ID endianness for little-endian systems when
you specify it for u32 (it matches network order - big endian).

tc filter add dev can0 parent 1:0 basic \
u32 match u32 0x01000000 0xffffffff at 0 flowid 1:1

tc qdisc add dev can0 parent 1:1 handle 10 pfifo_head_drop limit 1
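
The byte swap mentioned for little-endian systems can be checked with a few lines of C. This is a sketch under the assumption that the CAN ID word sits at offset 0 of the frame in host byte order while u32 compares in big-endian order; swap_bytes32 and u32_key_on_le_host are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
uint32_t swap_bytes32(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) |
           ((v & 0x0000ff00u) << 8)  |
           ((v & 0x00ff0000u) >> 8)  |
           ((v & 0xff000000u) >> 24);
}

/* Value to write in "match u32 VALUE MASK at 0" for a given CAN ID
 * when the filter runs on a little-endian host. */
uint32_t u32_key_on_le_host(uint32_t can_id)
{
    return swap_bytes32(can_id);
}
```

For CAN ID 0x1 this yields 0x01000000, the value used in the filter line above; the mask would need the same treatment if it is not symmetric like 0xffffffff.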


Best wishes,

Pavel

--
Pavel Pisa
e-mail: ***@cmp.felk.cvut.cz
www: http://cmp.felk.cvut.cz/~pisa
university: http://dce.fel.cvut.cz/