SubT Virtual Tunnel Circuit
Robotika Team
The DARPA Subterranean Challenge Virtual Tunnel Circuit was postponed from August to October 2019, so we decided to split the original Tunnel Circuit report into two. Update: 3/11/2019 — Robotika Virtual Videos
There are a couple of paragraphs related to the Virtual Tunnel Circuit in the
older report, in particular:
- Virtual Track - starting from 9th position ... (64 days) (12th June 2019)
- Freyja and Virtual Nightmare ... (21/35 days) (11th July 2019)
- Virtual too easy? Let's introduce launch changes ... (11+7/25 days) (21st July 2019)
- DARPA call, Freyja upside down, CYNET.ai (9+7/23 days) (23rd July 2019)
- Virtual Tunnel Circuit extended by two months! (8+7+60/23 days) (24th July 2019)
Content
- Restart (38 days)
- Accelerometer and local planner (36 days)
- Stone Age (34 days)
- Everything under control! (33 days)
- 250Hz IMU bottleneck (31 days)
- Robotika is leading the board! (29 days)
- Once upon a time ... (21 days)
- Fake it until you make it ... (18 days)
- Double Fake! (17 days)
- Ver1 = DRILL (16 days)
- Ver2 = ROS_MASTER_URI and ROS_IP (15 days)
- Ver3 = std2ros and ASCII art (14 days)
- Ver4 = wait_for_sensors and 0 points (13 days)
- Invisible Enemy = Ver5, 6, 6b, 7, 8, 9, 10? (12 days)
- ZeroMQ = Ver10, Ver11? Finally 2 points! (10 days)
- Ver13 = Two Robots (9 days)
- Ver14 = HD camera, debug_artf and log strip (8 days)
- Ver15? Red Noise (7 days)
- Ver16=another ASCII Art, Ver17=Comeback robot/refactoring (6 days)
- Ver18=Comeback robot again (5 days)
- Ver20 = Too many origins (4 days)
- Ver21 = base64 stdout logging (3 days)
- Ver22 = Hlava 22, hack log dump (2 days)
- No Silver Bullet, Ver23 = origin ERROR, rosout.log.2 (1 day)
- The Gods Themselves (0 days)
- The 5 stages DABDA or DARPA?
- 5 days after
- 11 days after
- Robotika Virtual Presentation
- Tunnel Circuit Awards Ceremony
- Robotika Wins 100000USD!
- Robotika Virtual Videos
1st September 2019 — Restart (38 days)
The System Track of the Tunnel
Circuit is over, but not the virtual one:
SubT Challenge Virtual Competition Teams,
We are excited to see your SubT Challenge innovations in the Virtual Tunnel
Circuit competition, and know your teams are hard at work testing and
developing novel solutions!
In accordance with our previous communications, the closing date for
submissions of solutions for the Virtual Tunnel Circuit is Wednesday, 09
October at 11:59PM AoE (https://en.wikipedia.org/wiki/Anywhere_on_Earth).
We highly encourage you to take advantage of the three Tunnel Practice
scenarios available for your use to test your entire solution workflow, to
include submission through the SubT Virtual Portal. Please be sure to test
early and often, and we continue to welcome your feedback to improve the
Virtual Competition experience.
Awards and winners will be announced on October 30th.
We appreciate your continued contributions to the SubT Challenge. We welcome
any questions you may have and we look forward to seeing your developments.
Sincerely,
The SubT Challenge Team
So what to do first?
Test first! …
it may work without you knowing it. I ran the test before we left for Pittsburgh,
but the log file was too big to play with:
-rw-r--r-- 1 gpu gpu 1.6G Aug 9 12:08 /home/gpu/subt-x2-190809_094511.log
so now there is a tool to strip the log down to only the necessary streams (see
PR #237). Filtered to pose2d
and lidar scan it was only 16MB
(subt-x2-190809_094511-strip.log)
… much better. I expected the X2 robot to hit the edges of the start area, but
somehow it recovered; even the entrance was visible via the 2D lidar and the robot
entered the tunnel! But see the „map” after a few traveled meters:
Somehow the IMU is no longer working as it used to?! So now I know the answer to
my original question „what to do first?” — investigate what happened to the IMU
(due to the transition from Gazebo 9 to the Ignition engine, or because of recent
upgrades from the System Track)…
p.s. there is another rule which says !!!do not trust a single measurement!!!
so I re-ran the test:
./run.bash robotika tunnel_circuit_practice.ign robotName1:=X2 robotConfig1:=X2_SENSOR_CONFIG_4
where the robotika docker image is no longer based on
nkoenig/subt-virtual-testbed (it is not available now??) but on the new
osrf/subt-virtual-testbed described in the updated tutorial. And guess what?
The robot did not reach the edge of the starting area this time. Hmm. They said
that the position of the robots is not going to be random … so what changed?
I still cannot understand how people can live (I mean develop robots)
without logfiles?! What do they do in this situation?
All I need to do is to run:
python -m subt replay subt-x2-190901_192059.log
and it fails, because in the feature/virtual-ignition branch I changed the command
go straight from 9 meters to 3 meters, based on a note I made in the
logfile last time … now everything passes fine … it is time to
download the large old logfile … done. Are you also curious?
It passes fine too! But …
SubT Challenge Ver2!
None
go_straight 3.0 (0, 0, 0) 0:00:00.170243 (0.0 0.0 0.0) [] 0:00:00.170243 []
maximum delay overshot: 0:00:00.120148
maximum delay overshot: 0:00:00.109536
maximum delay overshot: 0:00:00.128402
maximum delay overshot: 0:00:00.128848
maximum delay overshot: 0:00:00.107339
maximum delay overshot: 0:00:00.107657
so in the old image there was a delay of more than 0.1s in the control loop, which
is good news — it could be the reason for the strange IMU behavior (not properly
synced with the rest of the system), so maybe the newer docker image is better,
faster?!
How long did it take to analyze? 5 minutes? (yes, the slowest part was copying
1.6GB from one machine to another)
So let's try 4 meters …
Well, it will be „a lot of fun” — now it generated a set of collisions and
X2 tried to recover. But it collided even with „tunnel air” when it tried to enter the tunnel.
Time to check accelerometers!
- the complete log file subt-x2-190901_195952.log (400MB)
- video of how it ended: https://youtu.be/N1Om4LO73uE
3rd September 2019 — Accelerometer and local planner (36 days)
The progress yesterday was small, but at least
the
logfile confirmed the source of the false collision detection (see reported issue
#166).
In short, whenever robot X2 wants to go or stop, its acceleration (after removal
of the G-force) was larger than the original threshold of 12m/s^2. This is probably a
bug, because the robot has a limit of 7m/s^2, but in the meantime I increased the
threshold to 18m/s^2, so it probably hits the obstacles harder now.
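A minimal sketch of this kind of accelerometer-based collision check, under one possible interpretation (the gravity handling, thresholds and example values are only illustrative, not the exact OSGAR implementation):

import math

GRAVITY = 9.81    # m/s^2, removed from the measured acceleration magnitude
THRESHOLD = 18.0  # m/s^2, raised from the original 12 m/s^2

def is_collision(acc_xyz):
    # Return True if the acceleration (without gravity) exceeds the threshold.
    ax, ay, az = acc_xyz
    magnitude = math.sqrt(ax * ax + ay * ay + az * az)
    return abs(magnitude - GRAVITY) > THRESHOLD

print(is_collision([30.0, 0.0, 9.8]))  # True, a hard hit
print(is_collision([1.0, 0.0, 9.8]))   # False, normal driving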
After the post Wolfgang sent me a note: You may already have noticed that
the angular velocity of X2 has changed again. The change in the last version of
the docker file led to silly dancing (turning right and turning left) of
the X2 robot in our tests :-( … hmm, see the already „resolved” issue
#161
Rotation Behavior Dependent on Robot Orientation … so the angular speed differs
based on the robot orientation?? Sigh, the System Track was easy.
One of the problems with the current code is the accumulated delay from the local
planner, causing a crash at the end. Practice Tunnel 1 contains many
heaps of stones and boards, and not all of them are visible on the 2D lidar. Also
only narrow alleys remain.
For convenience I also switched to X2 configuration #3 with a QVGA camera to
speed up the test cycle and reduce the log size, but then the first detected artifact is
not as nice:
The first detected artifact
Yes, another reason was that even the JPEG images are sometimes very large:
Connected by ('127.0.0.1', 54808)
Exception in thread Thread-17:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/developer/subt_ws/osgar/osgar/drivers/rosmsg.py", line 267, in run
    self.bus.publish('image', parse_jpeg_image(packet))
  File "/home/developer/subt_ws/osgar/osgar/drivers/rosmsg.py", line 90, in parse_jpeg_image
    assert size < 230467, size  # expected size for raw image during experiment
AssertionError: 232056
so maybe ROS and Ignition use different compression factors? Time to remove this
old assert (it used to verify that I am receiving compressed images).
5th September 2019 — Stone Age (34 days)
It looks like the organizers like stones — they are everywhere :-(. And sure
enough they are not big enough to be visible by the 2D lidar:
It is possible to detect collisions from the accelerometers, but at the moment (I
mean the current simulation docker) there are too many false alarms. I tried
practice tunnels 01 and 02 and very soon you approach a heap of stones … and
if you try the virtual STIX you hit a „virtual wall” even before entering
the tunnel!
0:00:03.057975 go_straight 7.0 (speed: 0.5) (0.0, 0.0, -0.7923794804054256)
0:00:03.151615 Collision! [18.273, 0.004, 18.782] reported: False
0:00:03.176237 Collision! [25.47, 0.0, 19.8] reported: False
0:00:38.168127 Collision! [-19.87, -0.011, 21.035] reported: True
Sigh. There is one piece of good news — so far the robot (only one, so I do not yet
have problem #168 reported yesterday with deploying multiple robots) is
placed at approximately (-5, 5, 0), so it is enough to code:
self.turn(math.radians(-45))
self.go_straight(7.0)
to enter the tunnel. Yeah, I know that it is not very sophisticated, but first
I need to get the thing working and score at least one „fake” point in the
cloud simulation. Somehow this simple goal is still far, far „in the
clouds”.
Maybe one more picture — an attempt to enter the virtual STIX tunnel:
6th September 2019 — Everything under control! (33 days)
p.s. well, maybe I should add a note about what you see and what you should see …
This video was generated from a hacked
osgar.tools.lidarview
with a few extra lines to
save
the pygame screen:
pygame.image.save(screen, "tmp.jpg")
and an OpenCV wrapper to save an AVI file. I even uploaded the
subt-x2-190906_185628.log
(175MB) logfile, so everyone can replay it. Note that the development is
currently in the git
feature/virtual-ignition
branch. The lidarview is „universal” and it was also used for analysis of
OSGAR robots in Pittsburgh — this means the parameters can be a bit
complicated, so I also added a simple helper subt/script/virtual-view.bat,
which contains only this line:
python -m osgar.tools.lidarview --pose2d app.pose2d --lidar rosmsg_laser.scan --camera rosmsg_image.image --keyframes detector.artf %1
So what do you see? In the top left corner is the ROS image from the camera, the green points
are ROS LIDAR scan points, and finally orange is the robot 2D pose with orientation
and its history/trace. The scale is in the bottom left corner (1 meter). Note
the stream name app.pose2d, so it is already the fused position from IMU and
odometry.
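For illustration, this is roughly the kind of fusion behind such a pose2d stream: the distance traveled from odometry combined with the yaw from the IMU. A simplified sketch, not the actual OSGAR code (class and method names are made up here):

import math

class PoseFusion:
    """Combine traveled distance (odometry) with heading (IMU yaw)."""
    def __init__(self):
        self.x, self.y = 0.0, 0.0
        self.yaw = 0.0             # latest yaw from the IMU, in radians
        self.last_odom_dist = None

    def update_yaw(self, yaw):
        self.yaw = yaw

    def update_odometry(self, dist):
        # dist = cumulative traveled distance reported by odometry
        if self.last_odom_dist is not None:
            step = dist - self.last_odom_dist
            self.x += step * math.cos(self.yaw)
            self.y += step * math.sin(self.yaw)
        self.last_odom_dist = dist

    def pose2d(self):
        return self.x, self.y, self.yaw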
Now what you should see, and definitely do not, … is that if you combine the
laser scan with the yaw angle from the IMU on a reasonably flat surface (this was
worldName:=simple_tunnel_01 added in
the
latest release 2019-09-05) the directions should be aligned! No wild
rotations!
What could be wrong? One obvious possibility is timing — you simply have the
LIDAR scan and the IMU yaw data from different times. So that is how I came to
issue #170 -
Irregular X2/odom — the timing of the IMU gyro and accelerometer is perfect but
odometry is two to four times slower. The OSGAR log file contains both: internal
streams (used in lidarview for example) as well as the raw communication with
„external devices” (in this case ROS). You can replay the processing of the ROS
IMU:
python -m osgar.replay --module rosmsg_imu subt-x2-190906_185628.log
or ROS odometry:
python -m osgar.replay --module rosmsg_odom subt-x2-190906_185628.log
and print the ROS timestamps in nanoseconds. For the IMU you get differences of 4000000ns,
which is 4ms … so the update rate is 250Hz? And Addisu Z. Taddese wrote that
odometry should be at 50Hz, so I should have 5 times more messages from the IMU
than from odometry … and I don't! Moreover, the irregularity is really bad:
python -m subt replay e:\osgar\logs\subt2\virtual\subt-x2-190906_161345.log
…
0:00:10.311060 acc
0:00:10.316274 rot
0:00:10.316339 acc
0:00:10.319772 rot
0:00:10.319824 acc
0:00:10.321231 pose2d
0:00:10.321453 pose2d
0:00:10.321734 pose2d
0:00:10.323033 rot
0:00:10.323096 acc
0:00:10.326389 rot
0:00:10.326485 acc
0:00:10.331349 rot
0:00:10.331419 acc
so here are pose2d messages 3 times in a row and no accelerometer or rotation
messages in between. Sigh. I think the cause is clear — it is necessary to down-sample
the IMU to a „reasonable” 50Hz and reduce the internal traffic. Ideally I would
have a „block update” of all sensors, but from ROS I have only independent
TCP connections …
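Down-sampling the 250Hz IMU to roughly 50Hz can be as simple as forwarding every 5th message. A sketch of the idea only, not the actual OSGAR driver code (class name and callback signature are assumptions):

class ImuDownsampler:
    """Forward only every N-th IMU message to reduce internal traffic."""
    def __init__(self, publish, ratio=5):  # 250Hz / 5 = 50Hz
        self.publish = publish  # callback taking one IMU message
        self.ratio = ratio
        self.counter = 0

    def on_imu(self, msg):
        self.counter += 1
        if self.counter % self.ratio == 0:
            self.publish(msg)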
8th September 2019 — 250Hz IMU bottleneck (31 days)
OK, the video from simple_tunnel_01 was not that funny … but it exposed an
interesting case
(#172): the
simulation can be fast enough that the TCP
communication becomes the bottleneck! In my case it is the IMU, which sends updates at 250Hz. It gets delayed,
and this is especially visible on the unit-test tunnels, i.e. the ones that are fast to
simulate. What happened was that after 2 minutes of simulation my IMU data were 10
seconds old! Somehow pose2d from normal odometry is much more reliable than
the position computed from the IMU:
What you see above is that when the robot turned by 180 degrees the IMU did not
notice it for another 10 seconds (the robot continued straight), which is why
the map is so bad.
What to do if your orientation is 10s old? You can stop and wait 10s.
Unfortunately the buffer is not flushed, so it will still be 10s old. We can
use the „fall-back” to version 0 — turn 180 degrees and follow the other
wall for the already traveled distance. With this strategy we could at least return
to the tunnel entrance (simple_tunnel_01). It worked for speed 0.5m/s, but
did not work for 2m/s. Maybe if I disable the LocalPlanner again … ???
An attempt with tunnel_circuit_practice_01 ended up on the robot's roof at
speed 1m/s, and it looks like slowing down to 0.5m/s does not help either.
p.s. well, it is not only the 10s old orientation, it is also the info that 10s
ago you probably hit something … no wonder the robot is upside down :-(
10th September 2019 — Robotika is leading the board! (29 days)
(with zero points, practice tunnel 01)
There is a new design of the SubT Challenge
Virtual Testbed Portal and the
Leader Boards are also active now! And
believe it or not, Robotika is in 1st place! … yeah, just kidding, … almost …
I wanted to make some bad jokes about the PR article
Autonomní
roboti z ČVUT vyhráli v USA soutěž pořádanou agenturou DARPA (Autonomous
robots from CTU won the U.S. contest organized by DARPA). I am glad that they
are more proud of beating us than of being in 3rd place next to Carnegie Mellon
and NASA/JPL/Caltech/MIT. … so we are winning in a similar way as we were
losing with 8 points in the last position in the Tunnel Qualification.
This time it is 0 points, just proof that we tried. There is potentially a long way
to go to score at least one point!
18th September 2019 — Once upon a time … (21 days)
Believe it or not, I still did not manage to get a single point after a week of
experiments! Yes, it is true that in the meantime we organized
Robotour in Deggendorf, but still …
something is very wrong and I am sure that it is some very stupid detail. :-(
See issue
#186.
So where are we? First of all, we have the IMU back at 50Hz
(#172), and locally it is
as nice as it used to be. Yeah, I still have to review whether the
collisions detected by the accelerometer can be reduced, but later. The
main goal is to get 1 point, version 0, as always.
What do I know so far? Nate mentioned an important „detail”: The robot must get
close enough to an artifact for it to be loaded before you can attempt to score
with the artifact. You can see parts of the tunnel loading as a robot moves
through the environment. Attempting to report an artifact that was never loaded
will fail. An artifact only has to be loaded once. … so that's why a debug
routine just entering the tunnel and reporting a previously detected (from an older
simulation) artifact did not work — so there is no such fast „development
cycle”. You have to wait a couple of minutes to see that it failed again. OK, it
definitely makes sense for the competition.
Then there is an issue with the keyword defining the Base Station: it used to
be BaseStation and now it is base_station. The recommended solution is
subt::kBaseStationName.
I also reviewed the artifacts — they are different compared to the
qualification times and also re-indexed.
But there are people trying to help me. Arthur directed me to the
subt_hello_world
development branch and Alfredo added a three-terminal solution:
Terminal 1
$ mkdir -p ~/subt_testbed && cd ~/subt_testbed
$ wget https://bitbucket.org/osrf/subt/raw/tunnel_circuit/docker/run.bash
$ chmod +x run.bash
$ ./run.bash osrf/subt-virtual-testbed:cloudsim_sim_latest cloudsim_sim.ign robotName1:=X1 robotConfig1:=X1_SENSOR_CONFIG1
Terminal 2
$ cd ~/subt_testbed/
$ ./run.bash osrf/subt-virtual-testbed:cloudsim_bridge_latest robotName1:=X1 robotConfig1:=X1_SENSOR_CONFIG1
Terminal 3
$ cd ~/subt_testbed/
$ ./run.bash osrf/subt-virtual-testbed:subt_solution_latest
… and it worked in the sense that it reported artifact:
[ INFO] [1568782543.383065601]: Starting seed competitor
[ INFO] [1568782543.398386348]: Waiting for /clock, /subt/start, and /subt/pose_from_artifact
[ INFO] [1568782543.639311235, 47.808000000]: Using robot name[X1]
[ INFO] [1568782543.781596453, 47.832000000]: Sent start signal.
[ INFO] [1568782565.031902152, 59.608000000]: Arrived at entrance!
[ INFO] [1568782565.269722166, 59.708000000]: Message from [base_station] to [X1] on port [4100]:
[report_id: 1 artifact { type: 4 pose { position { x: -8.1 y: 37 z: 0.004 } } } submitted_datetime { sec: 4798503 nsec: 310287108 } sim_time { sec: 59 nsec: 608000000 } run: 1 report_status: "scored" score_change: 1 ]
but to be honest I do not have a clue how this is done, where the source is, or even which tunnel this is?
Do you see the QVGA artifacts?
21st September 2019 — Fake it until you make it … (18 days)
... but it will not help you anyway!
The plan last night was simple: insert our simple 2-minute tunnel exploration
before the subt_seed example, and it should be enough to score 1 point,
as it worked locally:
run: 1 report_status: "scored" score_change: 1
but not on AWS (Amazon Web Services):
Hello md,
Your Tunnel Practice 1 simulation with id 15de7ba1-e825-4f39-b8d2-34f427e8416c has finished successfully.
Summary
{
  "sim_time_duration_avg": 0,
  "sim_time_duration_std_dev": 0,
  "real_time_duration_avg": 0,
  "real_time_duration_std_dev": 0,
  "model_count_avg": 0,
  "model_count_std_dev": 0
}
Regards,
Open Robotics Team
You can even download the ROS log, which looks very promising, except it was almost empty:
Empty ROS log files
Now my impression is that I have to „bundle” the bridge into my docker, maybe
… so far still 0 points. :-(
22nd September 2019 — Double Fake! (17 days)
If one fake does not work then double it! I wonder how the simulation of
seed2 ends on AWS — it is not necessarily a „fake attempt” to score 2
points on the Tunnel Practice 01 Leader
Board, but rather a way to get proof (?) that the scoring behaves differently than expected
(#186).
In short, you do not have to explore the tunnel, but … maybe this „feature”
is intended for the final Tunnel Circuit world??
What I did was guess the position of the Backpack artifact from this
trace:
The first attempt failed, but then a more proper measurement reduced the
position error and it passed. The docker image seed2 uses the original code but
reports two artifacts … and it should fail, because they are reported at the
tunnel entrance without any exploration. We will see.
On the other hand I also have good news — with some luck (but repeated
twice) X2 managed to navigate among the stones to the first Fire
Extinguisher and report its valid position.
Now there is only the small detail of putting these two parts together and
getting some real points in the AWS simulation. Here is the logfile
subt-x2-190922_182354.log
(213MB) and the video (QVGA, X2 config #3, sorry).
python3 -m subt run subt/subt-x2.json --side left --timeout 128 --walldist 0.9
23rd September 2019 — Ver1 = DRILL (16 days)
Today's update (45 min to midnight) will be short. The fake from yesterday scored
2 points, but there is not much excitement there (yeah, we are at the top, but it will
not work anywhere else). I did some tests in practice tunnel 02 and the
robot turns around at the first blockage:
For some parameters it tries to enter the smaller hole on the left, but then it
turns 180 degrees, or for larger parameters it turns around immediately.
I wondered why Kevin/Coordinated Robotics selected Simple Tunnel 02 for the
deployment test and now I know — it is a nice tunnel with artifacts near the
entrance. There is a drill and a survivor close to the start:
Drill
Survivor
The detectors are not tweaked yet, so the Survivor is not recognized at all and
the Drill was reported as a Valve (a small red thing). At least the drill is fixed
… I should say hacked. The same holds for the camera resolution — half a year
ago we used a high-resolution camera; now I am too lazy to copy larger log files,
so I use QVGA instead (for now). The logfile is here:
subt-x2-190923_193000.log
(286MB), and the QVGA video is here.
The simulation with ver1 is running now, but I guess it will fail again …
24th September 2019 — Ver2 = ROS_MASTER_URI and ROS_IP (15 days)
Well, there was micro progress
(#199):
the docker on AWS does not use the local IP address 127.0.0.1. So now there is
at least an indication that osgar tried to subscribe to and publish the basic
topics:
[rosmaster.master][INFO] 2019-09-24 10:05:48,127: +SUB [/X2/imu/data] /osgar_node http://10.36.176.3:8000
[rosmaster.master][INFO] 2019-09-24 10:05:48,144: +SUB [/X2/front_scan] /osgar_node http://10.36.176.3:8000
[rosmaster.master][INFO] 2019-09-24 10:05:48,160: +SUB [/X2/front/image_raw/compressed] /osgar_node http://10.36.176.3:8000
[rosmaster.master][INFO] 2019-09-24 10:05:48,177: +SUB [/X2/odom] /osgar_node http://10.36.176.3:8000
[rosmaster.master][INFO] 2019-09-24 10:05:48,192: +PUB [/X2/cmd_vel] /osgar_node http://10.36.176.3:8000
[rosmaster.master][INFO] 2019-09-24 10:05:48,220: publisherUpdate[/X2/cmd_vel] -> http://10.36.176.2:33337/ ['http://10.36.176.3:43523/', 'http://10.36.176.3:8000']
[rosmaster.master][INFO] 2019-09-24 10:05:48,221: publisherUpdate[/X2/cmd_vel] -> http://10.36.176.2:33337/ ['http://10.36.176.3:43523/', 'http://10.36.176.3:8000']: sec=0.00, result=[1, '', 0]
… but that is all :-(.
I am not sure if it is OK that all four teams in the simplest environment have 0 points?!
25th September 2019 — Ver3 = std2ros and ASCII art (14 days)
Another run failed, again … so let's see if any of the prepared „traps” contains
anything. The changes were:
- print robot position via ROS_INFO from the modified subt_seed example
- remove original publish of topic cmdVel to avoid collision in commands
- use a simple std2ros.py to redirect our Python3 stdout to ROS:
import sys
import rospy

rospy.init_node('mdtalker', anonymous=True)
for i, line in enumerate(sys.stdin):
    rospy.loginfo(str(i) + ': ' + line.strip())
Will it work? Or are all the problems on stderr?
Before I open „the treasure box”
(ver3-76640b06-334f-47eb-8dd9-b09202a14ca8.tar.gz and
ver3-76640b06-334f-47eb-8dd9-b09202a14ca8-X2.tar.gz) I would mention some
fun I had today. I am running simulations remotely (I mean really remotely, via
web, RasPI, MapFactor, GPU server, docker) and most of the time all I have is
a CLI (command line interface). I tried to run a 5-minute simulation of X2
at speed 1m/s and … it was hard to guess what the robot explored. So in 15
minutes I wrote the small osgar.tools.asciiart
python -m osgar.tools.asciiart subt-x2-190925_104718.log --step 3
(ASCII-art output: a rough closed loop of X characters tracing the traveled path)
It looks like Simple Tunnel 02 is a relatively small circle.
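The idea behind such an ASCII map is simple: quantize the recorded pose2d positions into a coarse grid and print a character for every visited cell. A minimal sketch of the approach (function name, input format and grid size are illustrative, not the actual osgar.tools.asciiart code):

def ascii_map(poses, step=3.0):
    """poses: iterable of (x, y) positions in meters; step: grid cell size."""
    cells = {(int(round(x / step)), int(round(y / step))) for x, y in poses}
    min_x = min(c[0] for c in cells)
    max_x = max(c[0] for c in cells)
    min_y = min(c[1] for c in cells)
    max_y = max(c[1] for c in cells)
    lines = []
    for j in range(max_y, min_y - 1, -1):  # y grows "up" on the screen
        line = ''.join('X' if (i, j) in cells else ' '
                       for i in range(min_x, max_x + 1))
        lines.append(line)
    return '\n'.join(lines)

# example: a small rectangular loop of poses
loop = [(x, 0) for x in range(20)] + [(19, y) for y in range(10)] + \
       [(x, 9) for x in range(20)] + [(0, y) for y in range(10)]
print(ascii_map(loop))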
OK, open the box!
[rosmaster.master][INFO] 2019-09-25 09:00:13,508: +SUB [/X2/imu/data] /osgar_node http://10.37.192.3:8000
09:00:13,527: +SUB [/X2/front_scan] /osgar_node http://10.37.192.3:8000
09:00:13,542: +SUB [/X2/front/image_raw/compressed] /osgar_node http://10.37.192.3:8000
09:00:13,562: +SUB [/X2/odom] /osgar_node http://10.37.192.3:8000
09:00:13,577: +PUB [/X2/cmd_vel] /osgar_node http://10.37.192.3:8000
09:00:13,590: publisherUpdate[/X2/cmd_vel] -> http://10.37.192.2:36357/ ['http://10.37.192.3:8000']
09:00:13,603: +PUB [/rosout] /mdtalker_72_1569427213447 http://10.37.192.3:42793/
09:00:13,605: publisherUpdate[/X2/cmd_vel] -> http://10.37.192.2:36357/ ['http://10.37.192.3:8000']: sec=0.02, result=[1, '', 0]
09:00:13,605: publisherUpdate[/rosout] -> http://10.37.192.2:44631/ ['http://10.37.192.2:34241/', 'http://10.37.192.2:41203/', 'http://10.37.192.2:36399/', 'http://10.37.192.2:42761/', 'http://10.37.192.2:46663/', 'http://10.37.192.2:36357/', 'http://10.37.192.2:43891/', 'http://10.37.192.2:39637/', 'http://10.37.192.2:37481/', 'http://10.37.192.2:44531/', 'http://10.37.192.2:41603/', 'http://10.37.192.2:46139/', 'http://10.37.192.3:33947/', 'http://10.37.192.3:42793/']
09:00:13,607: publisherUpdate[/rosout] -> http://10.37.192.2:44631/ ['http://10.37.192.2:34241/', 'http://10.37.192.2:41203/', 'http://10.37.192.2:36399/', 'http://10.37.192.2:42761/', 'http://10.37.192.2:46663/', 'http://10.37.192.2:36357/', 'http://10.37.192.2:43891/', 'http://10.37.192.2:39637/', 'http://10.37.192.2:37481/', 'http://10.37.192.2:44531/', 'http://10.37.192.2:41603/', 'http://10.37.192.2:46139/', 'http://10.37.192.3:33947/', 'http://10.37.192.3:42793/']: sec=0.00, result=[1, '', 0]
09:00:13,609: +SUB [/clock] /mdtalker_72_1569427213447 http://10.37.192.3:42793/
09:00:13,611: +SERVICE [/mdtalker_72_1569427213447/get_loggers] /mdtalker_72_1569427213447 http://10.37.192.3:42793/
09:00:13,613: +SERVICE [/mdtalker_72_1569427213447/set_logger_level] /mdtalker_72_1569427213447 http://10.37.192.3:42793/
Isn't it a problem that osgar_node does not have /rosout? Otherwise
I see both osgar_node and mdtalker, so let's continue …
rosout.log has 8MB and it looks interesting!
0.000000000 INFO /subt_solution [/home/developer/subt_solution/src/subt_seed/src/subt_seed_node.cc:103(Controller)] [topics: /rosout] Waiting for /clock, /subt/start, and /subt/pose_from_artifact
375.900000000 INFO /subt_solution [/home/developer/subt_solution/src/subt_seed/src/subt_seed_node.cc:111(Controller)] [topics: /rosout] Using robot name[X2]
Whooow! It took more than 6 minutes before it started?!
(#202)
375.908000000 INFO /subt_solution [/home/developer/subt_solution/src/subt_seed/src/subt_seed_node.cc:146(Update)] [topics: /rosout] Sent start signal.
375.944000000 INFO /subt_solution [/home/developer/subt_solution/src/subt_seed/src/subt_seed_node.cc:196(Update)] [topics: /rosout] MD robot pose -6 5 dist=61
Another one:
389.700000000 INFO /subt_solution [/home/developer/subt_solution/src/subt_seed/ src/subt_seed_node.cc:196(Update)] [topics: /rosout] MD robot pose -6 5.00004 dist=61.0004
Well, it is
double dist = pose.position.x * pose.position.x + pose.position.y * pose.position.y;
so it is actually the squared distance (for the pose (-6, 5): 36 + 25 = 61, while the real distance is sqrt(61) ≈ 7.8 m)!
5421.404000000 INFO /subt_solution [/home/developer/subt_solution/src/subt_seed/ src/subt_seed_node.cc:196(Update)] [topics: /rosout] MD robot pose -5.98597 4.99567 dist=60.7885
so the robot did not move much, but surprisingly the position was not stable?
No message from mdtalker :-(
p.s. the team Scientific Systems Company has 3 points and it is the leader again …
p.s.2 the osgar.tools.lidarview map looks much nicer than the ASCII art above …
26th September 2019 — Ver4 = wait_for_sensors and 0 points (13 days)
Bad, bad, bad :-(
Yesterday, during our regular SubT telco, a small script
wait_for_sensors.py was created,
which is based on Python2 ROS and waits for all the necessary sensors. If you
expect that it helped, then you are wrong. BTW note the 5 teams in Simple Tunnel
02 — the deadline is approaching …
ROS master log (filter for listener):
+PUB [/rosout] /listener_51_1569448646286 http://10.36.32.3:39067/
+SUB [/clock] /listener_51_1569448646286 http://10.36.32.3:39067/
+SERVICE [/listener_51_1569448646286/get_loggers] /listener_51_1569448646286 http://10.36.32.3:39067/
+SERVICE [/listener_51_1569448646286/set_logger_level] /listener_51_1569448646286 http://10.36.32.3:39067/
+SUB [/X2/imu/data] /listener_51_1569448646286 http://10.36.32.3:39067/
+SUB [/X2/front_scan] /listener_51_1569448646286 http://10.36.32.3:39067/
+SUB [/X2/front/image_raw/compressed] /listener_51_1569448646286 http://10.36.32.3:39067/
+SUB [/X2/odom] /listener_51_1569448646286 http://10.36.32.3:39067/
-SUB [/clock] /listener_51_1569448646286 http://10.36.32.3:39067/
-SUB [/X2/front/image_raw/compressed] /listener_51_1569448646286 http://10.36.32.3:39067/
-SUB [/X2/front_scan] /listener_51_1569448646286 http://10.36.32.3:39067/
-SUB [/X2/imu/data] /listener_51_1569448646286 http://10.36.32.3:39067/
-SUB [/X2/odom] /listener_51_1569448646286 http://10.36.32.3:39067/
-PUB [/rosout] /listener_51_1569448646286 http://10.36.32.3:39067/
-SERVICE [/listener_51_1569448646286/get_loggers] /listener_51_1569448646286 rosrpc://10.36.32.3:43559
-SERVICE [/listener_51_1569448646286/set_logger_level] /listener_51_1569448646286 rosrpc://10.36.32.3:43559
… now I guess that + is creation and - removal of published, subscribed or queried topics. This looks OK.
ROS master log for mdtalker (utility forwarding stdout to ROS)
+PUB [/rosout] /mdtalker_74_1569448647072 http://10.36.32.3:36669/
+SUB [/clock] /mdtalker_74_1569448647072 http://10.36.32.3:36669/
+SERVICE [/mdtalker_74_1569448647072/get_loggers] /mdtalker_74_1569448647072 http://10.36.32.3:36669/
+SERVICE [/mdtalker_74_1569448647072/set_logger_level] /mdtalker_74_1569448647072 http://10.36.32.3:36669/
… this also looks OK: it requests the output topic and asks about the verbosity level.
Finally osgar_node:
+SUB [/X2/imu/data] /osgar_node http://10.36.32.3:8000
+SUB [/X2/front_scan] /osgar_node http://10.36.32.3:8000
+SUB [/X2/front/image_raw/compressed] /osgar_node http://10.36.32.3:8000
+SUB [/X2/odom] /osgar_node http://10.36.32.3:8000
+PUB [/X2/cmd_vel] /osgar_node http://10.36.32.3:8000
… good, as expected, without any termination, so the Python3 program was probably killed together with the docker?
Empty roslaunch-sim-f94997ad-ae3a-4260-9a71-4835c0196b6d-comms-rbt1-52.log.
0.000000000 Node Startup
0.000000000 INFO /tf_world_static [/tmp/binarydeb/ros-melodic-tf2-ros-0.6.5/src/static_transform_broadcaster_program.cpp:142(main)] [topics: /rosout, /tf_static] Spinning until killed publishing world to simple_tunnel_02
0.000000000 INFO /mdtalker_74_1569448647072 [std2ros.py:10(std2ros)] [topics: /clock, /rosout] --—- BEGIN--—-
416.060000000 INFO /X2/ros1_ign_bridge_twist [/tmp/ros1-ign-bridge-release/include/ros1_ign_bridge/factory.hpp:137(Factory::ros1_callback)] [topics: /rosout] Passing message from ROS1 geometry_msgs/Twist to Ignition ignition.msgs.Twist (showing msg only once per type
0.000000000 INFO /subt_solution [/home/developer/subt_solution/src/subt_seed/src/subt_seed_node.cc:103(Controller)] [topics: /rosout] Waiting for /clock, /subt/start, and /subt/pose_from_artifact
416.556000000 INFO /subt_solution [/home/developer/subt_solution/src/subt_seed/src/subt_seed_node.cc:111(Controller)] [topics: /rosout] Using robot name[X2]
Sigh … it looks like all ROS nodes have their own internal local time and they
are then put into one rosout.log file without sorting.
There is not a single record of listener (!!!), i.e. the intermediate
wait_for_sensors ROS code waiting for data via ROS_INFO.
(This is some indication that rewriting our code to Python2 ROS would not help much :-( )
What is confusing, on the other hand, is how mdtalker could report something if
it starts after listener?!
27th September 2019 — Invisible Enemy = Ver5, 6, 6b, 7, 8, 9, 10? (12 days)
This is the current status. Not much change on the Simple Tunnel 02 Leader
Board, but you can see on the right side that in the last 24 hours robotika released
not one but many docker images. Without any luck. So what is going on?
First of all I am convinced that there is a bug in the
simulation-bridge-solution triple docker (yeah, every piece of code has at least one
bug, right?). As proof I consider run ver6 and, with an identical docker,
ver6b. In order to handle the 6-minute initialization of the simulation
monster I wrote extra pure ROS code (see
wait_for_sensors.py)
which waits for each topic, i.e. for each sensor.
I must admit that I actually almost like the Python ROS code:
def wait_for(topic, topic_type):
    rospy.loginfo('Wait for ' + topic)
    rospy.wait_for_message(topic, topic_type)
    rospy.loginfo('Done with ' + topic)

def wait_for_sensors():
    rospy.init_node('mdwait', anonymous=True)
    wait_for_master()
    rospy.loginfo('--—- mdwait BEGIN--—-')
    wait_for('/X2/imu/data', Imu)
    wait_for('/X2/front_scan', LaserScan)
    wait_for('/X2/front/image_raw/compressed', CompressedImage)
    wait_for('/X2/odom', Odometry)
    wait_for('/X2/battery_state', BatteryState)
    rospy.loginfo('---mdwait END---')
… but it does not work, and YOU tell me how you would debug it?! And who
on the organizers' side will fix it in the last few remaining days? I suppose
that in ROS you can launch several nodes independently and that they do not
have to be specified in a single launch file??
roslaunch subt_seed x1.launch &
cd osgar
python subt/wait_for_sensors.py
python3 -m subt run subt/subt-x2.json --side left --walldist 0.75 --timeout 180 --note "try to visit artifact and return home" 2>&1 | python subt/std2ros.py
This was the run_solution.bash code … so subt_seed starts first and
wait_for_sensors.py starts in parallel.
This was the end of "ver6":
396.284000000 INFO /mdwait_71_1569502355368 [wait_for_sensors.py:24(wait_for)] [topics: /clock, /rosout] Wait for /X2/front/image_raw/compressed
… it was the first and also the last mdwait message in rosout.log!!!
Obviously ROS_INFO is useless in this situation if you do not get the messages from
the beginning.
Just for comparison, ver6b (identical docker, alternative run) had this output:
429.896000000 INFO /mdwait_71_1569527167188 [wait_for_sensors.py:24(wait_for)] [topics: /clock, /rosout] Wait for /X2/odom
429.996000000 INFO /mdwait_71_1569527167188 [wait_for_sensors.py:26(wait_for)] [topics: /clock, /rosout] Done with /X2/odom
429.996000000 INFO /mdwait_71_1569527167188 [wait_for_sensors.py:24(wait_for)] [topics: /clock, /rosout] Wait for /X2/battery_state
430.076000000 INFO /mdwait_71_1569527167188 [wait_for_sensors.py:26(wait_for)] [topics: /clock, /rosout] Done with /X2/battery_state
430.080000000 INFO /mdwait_71_1569527167188 [wait_for_sensors.py:38(wait_for_sensors)] [topics: /clock, /rosout] ---mdwait END---
431.580000000 INFO /X2/ros1_ign_bridge_twist [/tmp/ros1-ign-bridge-release/include/ros1_ign_bridge/factory.hpp:137(Factory::ros1_callback)] [topics: /rosout] Passing message from ROS1 geometry_msgs/Twist to Ignition ignition.msgs.Twist (showing msg only once per type
0.000000000 INFO /mdtalker_92_1569527168132 [std2ros.py:10(std2ros)] [topics: /clock, /rosout] --—- BEGIN--—-
… so in this run it continued, got a message from each topic and even started the
Python3 code, which had its output redirected to std2ros … and that was
it.
So what to do if even this does not work? The proposal on the forum was to put all
nodes into a single launch file, but would it work (especially with the later OSGAR
integration)? With PavelJ we discussed the option of using
ZeroMQ to provide an alternative proxy between ROS and the
OSGAR code. What seems to work is a single C++ ROS node (subt_seed_node).
Conclusion? I guess that the only solution, if we still want to use our OSGAR
solution, is to implement the proxy in C++ as a single ROS node.
I must admit that I really hate ROS. … it took me several hours to dig
out how to serialize an IMU ROS message. It changed between ROS versions, and
because the new code is automatically generated (as I found out later), all I
found was an obsolete sensor_msgs/Imu.h. Sigh. This is the solution:
void imuCallback(const sensor_msgs::Imu::ConstPtr& msg)
{
  ros::SerializedMessage sm = ros::serialization::serializeMessage(*msg);
  zmq_send(g_responder, sm.buf.get(), sm.num_bytes, 0);
}
On the other hand, the OSGAR Python code to interface ZeroMQ took, say, 15 minutes
and worked on Windows and then on Linux without a problem. So now I have some
serialized IMU data and I (naively) hope that it will be in the same format that
I was getting over TCP from ROS.
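The Python side can indeed stay small. Here is a sketch of a ZeroMQ receiver for the serialized messages, assuming a PUSH/PULL socket pair and a localhost endpoint; the real OSGAR driver, the endpoint and the socket types may differ (the C++ side above names its socket g_responder, which could also be a REP socket):

import zmq

def receive_loop(endpoint='tcp://localhost:5555'):
    context = zmq.Context()
    socket = context.socket(zmq.PULL)  # counterpart of the C++ zmq_send() side
    socket.connect(endpoint)
    while True:
        yield socket.recv()            # one serialized ROS message per packet

# usage: feed the raw packets into the existing rosmsg parsing code, e.g.
# for raw in receive_loop():
#     handle_imu(raw)  # handle_imu is a placeholder name, not an actual function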
29th September 2019 — ZeroMQ = Ver10, Ver11? Finally 2 points! (10 days)
Just a short report — I dropped the old ROS proxy and replaced it with the ZeroMQ
interface (two pipe files). Again I was angry, this time with the inability to
deserialize the ROS message geometry_msgs::Twist. Sigh, but this
time I threw it away and used just a string with two float numbers, speed
and angular_speed. Many rude words … if somebody is convinced that the more
templates and streams you add to send a few bytes the better … ROS is probably not
for me.
The simulation is running now. Again it was not keeping up, so I had to down-sample
the lidar updates to 10Hz, which is not good, but I really would like to see
at least one point in Simple Tunnel 02. Then it will be time for profiling,
maybe. Or fixes to distinguish fire extinguisher, drill and backpack. Or two
robots. Or dangerous tilt. Or integration of stone avoidance based on the camera.
Or … tons of things … 10 days … easy. Let's go to sleep (after I complete the
docker image, upload it and start the simulation ver). You can
check the results yourselves.
p.s. the next morning …
… so the next step is probably cleanup and sending stdout via ZeroMQ to
learn what is going on there anyway …
778.704000000 INFO /subt_solution [/home/developer/subt_solution/src/subt_seed/src/subt_seed_node.cc:232(CommClientCallback)] [topics: /rosout, /X2/cmd_vel] Message from [base_station] to [X2] on port [4100]:
[report_id: 1 artifact { type: 2 pose { position { x: 22.79 y: 2.76 } } } submitted_datetime { sec: 959 nsec: 415188345 } sim_time { sec: 778 nsec: 540000000 } run: 1 report_status: "scored" score_change: 1 ]
778.708000000 INFO /subt_solution [/home/developer/subt_solution/src/subt_seed/src/subt_seed_node.cc:232(CommClientCallback)] [topics: /rosout, /X2/cmd_vel] Message from [base_station] to [X2] on port [4100]:
[report_id: 2 artifact { type: 7 pose { position { x: 43.45 y: 42.83 } } } submitted_datetime { sec: 959 nsec: 416012212 } sim_time { sec: 778 nsec: 540000000 } run: 1 report_status: "scored" score_change: 1 ]
30th September 2019 — Ver13 = Two Robots (9 days)
Tonight I was testing a combination of two robots — the good old fellows X2L
and X2R, i.e. robot X2 following the left and the right wall. It was necessary to
extend the C++/Python3 communication with the origin pose of the robot, and after
discovering that the second robot starts at:
Origin: [-6.0, 3.0, 0.0615] Use right wall: False
I started to use the name to distinguish on which side the robot should move:
Origin: [-6.0, 3.0, 0.0615] X2R Use right wall: True
(originally I wanted to use whether it is on the left or right side of the tunnel
center axis, but both robots start on the left side)
I am not sure about the results yet — both robots should detect the drill in
the entrance tunnel and then X2R should fail to recognize the extinguisher
… to be fixed tomorrow. X2L should work as the former X2, except that the
rotation to the tunnel center and the distance traveled are properly computed from the robot
origin.
p.s. it happened to me a couple of times that there were two Python3 controllers
running in parallel and crashing the robot … I am kind of surprised that they
co-exist …
1st October 2019 — Ver14 = HD camera, debug_artf and log strip (8 days)
Today it is so far just maintenance — I switched to X2_SENSOR_CONFIG_4
with the HD camera. I liked QVGA due to the reasonably small log files, but it was hard
even for me to recognize the few-pixel objects. :-(
So now even for a 100-second exploration I get a 1.7GB log file (for each robot),
while for QVGA it was around 200MB. In order to „survive” I am now trying
to use osgar.tools.strip to remove all communication with ROS and the
rosmsg.image stream (BTW the size of the JPEG images is roughly double that of the original
Gazebo, but in both cases it is "quality 80", so I do not know what else
changed). On the other hand I introduced the detector.debug_artf stream, which
contains all pictures selected as potential artifacts, ready for classification.
That is the part I want to fix tonight, i.e. reliably distinguishing a fire
extinguisher from a drill. It should be trivial, but it was not in QVGA, so we
will see from the logfiles with the HD camera. The final logfile has 12MB, which could
even be an e-mail attachment … so far so good.
python3 -m osgar.tools.strip zmq-subt-x2-191001_101528.log --remove receiver.raw rosmsg.image
-rw-r--r-- 1 developer developer 167M Oct 1 06:51 zmq-subt-x2-191001_044323.log
-rw-r--r-- 1 developer developer 1.6G Oct 1 07:05 zmq-subt-x2-191001_045510.log
-rw-r--r-- 1 developer developer 1.7G Oct 1 07:21 zmq-subt-x2-191001_051055.log
-rw-r--r-- 1 developer developer 12M Oct 1 15:08 zmq-subt-x2-191001_101528-strip.log
-rw-r--r-- 1 developer developer 1.7G Oct 1 12:26 zmq-subt-x2-191001_101528.log
I would also mention the simulation speed change on AWS when going from one
robot to two. Originally it was faster than reality; for example for ver12
I got a mail with these simulation times:
{ "sim_time_duration_avg": 3600, "sim_time_duration_std_dev": 0, "real_time_duration_avg": 2067, "real_time_duration_std_dev": 0, "model_count_avg": 1, "model_count_std_dev": 0 }
i.e. 1 hour of simulation took around 35 minutes. For ver13 (two robots) it was closer to
reality:
{ "sim_time_duration_avg": 3600, "sim_time_duration_std_dev": 0, "real_time_duration_avg": 4020, "real_time_duration_std_dev": 0, "model_count_avg": 2, "model_count_std_dev": 0 }
i.e. that 1 hour took an extra 7 minutes = good! Now I also have a guess what
model_count_avg means … it is the number of deployed robots, right?
p.s. joke of the day — „Is that with the latest Docker images from
yesterday?” is a sentence I noticed in issue
#216 — WHAT?!
There has been a
new
release and nobody knows about it?!
p.s.2 now we can run 10 simulations in parallel, see
#214
p.s.3 the GPU server just died today (disk full) so remote access failed
… here is a bad documentation photo; at least the minute of death is clear …
the clock stopped.
Disk Full = Dead
p.s.4 Stripped logfiles with detector.debug_artf stream:
2nd October 2019 — Ver15? Red Noise (7 days)
It turned out that the artifact classification failed primarily due to red
noise — many red dots in the image which I originally had not noticed.
That was also the reason (after the machine recovery) why I did not publish ver14
last night.
The cleaning solution is relatively simple — just call erode with a suitable kernel:
import numpy as np
import cv2

kernel = np.ones((5, 5), np.uint8)
not_mask = np.logical_not(mask)
img2 = img.copy()
img2[mask] = (255, 255, 255)
img2[not_mask] = (0, 0, 0)
img2 = cv2.erode(img2, kernel, iterations=1)
I just do not want to call this on every image, only on some potentially
interesting ones. So the filtering is postponed to the last moment, when I have to
decide which type of artifact to report.
The red dots were so invisible (even with artificial color) that I had to list
them in a text file and try a black and white image for reference. Now, as I think
about it, the counting could be simplified and all the old bounding boxes etc. can
also be postponed to the last moment! I should use "slow" blob detection on
the final image and then it should be easy, right?
BTW another „detail” difference between the qualification Gazebo and the competition
Ignition: the artifacts used to be much brighter (red > 100) but now it
would not work for a threshold higher than 40 …
Today I also noticed twice that my robot X2R did not leave the start
position (0, 0, 0). Once it was due to a typo in the configuration (see
#217,
X2_SENSOR_CONFIG4 vs. X2_SENSOR_CONFIG_4), but I am not sure about the
second attempt … maybe I was just not patient enough, because the right
robot is supposed to start 10 seconds later.
p.s. I made a very wrong assumption that the red noise is negligible compared to the
object itself. It is not, so the original implementation looking for the local
maxima as the best view was misleading. Now I use the filter for all images
before counting red pixels. Moreover there was a bug that I re-analyzed
not the best image but the image after the end of the locality, which was
typically empty! That was still not enough — there were some larger
particles left, so it was necessary to select only the largest blob … fixed,
and it should be released as ver15.
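Selecting the largest blob can be done with OpenCV connected components. A rough sketch of the idea, not the exact detector code (the function name and the 0/255 mask convention are assumptions):

import numpy as np
import cv2

def largest_blob(mask):
    """Keep only the largest connected component of a binary mask (uint8, 0/255)."""
    count, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
    if count <= 1:
        return mask  # nothing but background
    # stats[0] is the background; find the biggest remaining component
    largest = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))
    return np.where(labels == largest, 255, 0).astype(np.uint8)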
p.s.2 !!! F I N A L L Y !!!
Note that this simulation was also much faster — I added a call of
/subt/finish at the end, and the run is reduced to 100 seconds of
exploration:
{ "sim_time_duration_avg": 246, "sim_time_duration_std_dev": 0, "real_time_duration_avg": 373, "real_time_duration_std_dev": 0, "model_count_avg": 2, "model_count_std_dev": 0 }
3rd October 2019 — Ver16=another ASCII Art, Ver17=Comeback robot/refactoring (6 days)
This morning I celebrated 3 points with a bit of fun. As there is no
reasonable way to get binary data from the robot simulated on AWS to the
developer, and last night I was fighting with stupid bugs in the artifact
detection, I decided to dump the binary sub-images (masks) as more ASCII art.
Yes, going back to the 80's — would you recognize this artifact?
306.388000000 INFO Python3: stdout …………………………..XXX…….. 306.388000000 INFO Python3: stdout …………………XXXXXXXXXXXXXXXXXXXXX. 306.388000000 INFO Python3: stdout …………….XXXXXXXXXXXXXXXXXXXXXXXXXXX 306.388000000 INFO Python3: stdout …………….XXXXXXXXXXXXXXXXXXXXXXXXXXX 306.388000000 INFO Python3: stdout …………XXX.XXXXXXXXXXXXXXXXXXXXXXXXXXX 306.388000000 INFO Python3: stdout …..XXXXXXXXXXXXXXXXXXXXXXXXX.XXXXXXXXXXXX 306.388000000 INFO Python3: stdout …XXXXXXXXXXXXXXXXXXXXXXX……..XXXXXXXXX 306.388000000 INFO Python3: stdout …XXXXXXXXXXXXXXXXXXXX…………XXXXXXXX 306.388000000 INFO Python3: stdout ..XXXXXXXXXXXXXXXXXXXXXXXX………XXXXXXXX 306.388000000 INFO Python3: stdout ..XXXXXXXXXXXXXXXXXXXXXXXX………XXXXXXXX 306.392000000 INFO Python3: stdout ..XXXXXXXXXXXXXXXXXXXXXXXXXXXXX..XXXXXXXXXX 306.392000000 INFO Python3: stdout ..XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 306.392000000 INFO Python3: stdout ..XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 306.392000000 INFO Python3: stdout ..XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 306.392000000 INFO Python3: stdout …XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 306.392000000 INFO Python3: stdout …XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 306.392000000 INFO Python3: stdout ….XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX. 306.392000000 INFO Python3: stdout …..XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX. 306.392000000 INFO Python3: stdout …..XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.. 306.396000000 INFO Python3: stdout ………XXX.XXXXXXXXXXXXXXXXXXXXXXXXXX…. 306.396000000 INFO Python3: stdout …………..XXXXXXXXXXXXXXXXXXXXXXX…… 306.396000000 INFO Python3: stdout …………..XXXXXXXXXXXXXXXXXXXXX…….. 306.396000000 INFO Python3: stdout …………..XXXXXXXXXXXXXXXXXXXX……… 306.396000000 INFO Python3: stdout …………..XXXXXXXXXXXXXXXXXXXX……… 306.396000000 INFO Python3: stdout …………..XXXXXXXXXXXXXXXXXXX………. 306.396000000 INFO Python3: stdout …………..XXXXXXXXXXXXXXXXXXX………. 306.396000000 INFO Python3: stdout …………..XXXXXXXXXXXXXXXXXXX………. 306.396000000 INFO Python3: stdout …………..XXXXXXXXXXXXXXXXXXX………. 306.396000000 INFO Python3: stdout ………….XXXXXXXXXXXXXXXXXXXX………. 306.396000000 INFO Python3: stdout ………….XXXXXXXXXXXXXXXXXX………… 306.396000000 INFO Python3: stdout ………….XXXXXXXXXXXXXXXX………….. 306.396000000 INFO Python3: stdout …………XXXXXXXXXXXXXXX……………. 306.396000000 INFO Python3: stdout …………XXXXXXXXXXXXXXX……………. 306.396000000 INFO Python3: stdout …………XXXXXXXXXXXXXXX……………. 306.400000000 INFO Python3: stdout …………XXXXXXXXXXXXXXX……………. 306.400000000 INFO Python3: stdout ………..XXXXXXXXXXXXXXXX……………. 306.400000000 INFO Python3: stdout ………..XXXXXXXXXXXXXXXX……………. 306.400000000 INFO Python3: stdout ………..XXXXXXXXXXXXXXXX……………. 306.400000000 INFO Python3: stdout ………..XXXXXXXXXXXXXXXX……………. 306.400000000 INFO Python3: stdout ………..XXXXXXXXXXXXXXX…………….. 306.400000000 INFO Python3: stdout ………..XXXXXXXXXXXXXXX…………….. 306.400000000 INFO Python3: stdout ………..XXXXXXXXXXXXXXX…………….. 306.400000000 INFO Python3: stdout ……….XXXXXXXXXXXXXXX……………… 306.400000000 INFO Python3: stdout ……….XXXXXXXXXXXXXXX……………… 306.400000000 INFO Python3: stdout ……….XXXXXXXXXXXXXX………………. 306.400000000 INFO Python3: stdout ……….XXXXXXXXXXXXXX………………. 306.400000000 INFO Python3: stdout ……….XXXXXXXXXXXXXXX……………… 306.400000000 INFO Python3: stdout ……….XXXXXXXXXXXXXXX……………… 306.400000000 INFO Python3: stdout ………XXXXXXXXXXXXXXXX……………… 306.400000000 INFO Python3: stdout ……..XXXXXXXXXXXXXXXXXXXXXXX………… 306.400000000 INFO Python3: stdout …….XXXXXXXXXXXXXXXXXXXXXXXXX……….. 306.400000000 INFO Python3: stdout ….XXXXXXXXXXXXXXXXXXXXXXXXXXXXX………. 
306.400000000 INFO Python3: stdout …XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX…… 306.400000000 INFO Python3: stdout …XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX….. 306.400000000 INFO Python3: stdout ..XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX…. 306.400000000 INFO Python3: stdout ..XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX…. 306.400000000 INFO Python3: stdout XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX… 306.400000000 INFO Python3: stdout XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX… 306.400000000 INFO Python3: stdout XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX… 306.400000000 INFO Python3: stdout XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX… 306.400000000 INFO Python3: stdout XXXXXXXXXXXXXX.XXXXXXXXXXXXXXXXXXXXXXXXX… 306.400000000 INFO Python3: stdout XXXXXXXXXXXX….XXXXXXXXXXXXXXXXXXXXXXXX… 306.400000000 INFO Python3: stdout XXXXXXXXX……..XXXXXXXXX.XXXXXXXXXX…… 306.400000000 INFO Python3: stdout .XXXX……………………………….. 306.400000000 INFO Python3: stdout .XXX………………………………… 306.400000000 INFO Python3: stdout 1558 43 65 1237 1280 0.6615384615384615 0.3570661896243292 306.400000000 INFO Python3: stdout h/w 1.5116279069767442 306.400000000 INFO Python3: stdout 0:05:01.869666 Relative position: 2175 -3487 4637 TYPE_DRILL
There is a stupid "…" web replacement so you do not see what I see, but I do not
have enough energy to fix it now :-(.
That was Ver16.
The plan for Ver17 is to allow the robot multiple entries — up to now we used the
maximal time for exploration and returned just in time to report all artifacts
at the Base Station. Now we have 1 hour for exploration and there is quite a
high probability that the robot fails somewhere in the deep dungeons. The plan
is to return more often with the discovered artifacts and re-enter the tunnel. But
… it requires some refactoring, as the Python3 and the old C++ ROS code used to
communicate via a file containing the list of detected artifacts:
developer@gpu-desktop:~/subt_solution/osgar$ cat call_base.txt
TYPE_DRILL 19.32 1.52 0.06
TYPE_EXTINGUISHER 24.36 2.74 0.06
TYPE_DRILL 24.51 1.38 0.06
As soon as the C++ ROS code discovered this file, it parsed it, reported it, and the
program terminated.
Also the query for the position has to be re-triggered … hmm, there is probably
no other way around it.
4th October 2019 — Ver18=Comeback robot again (5 days)
No fun today. Ver17 failed, twice. In the first run the X2L robot scored 1
point and detected only the drill. X2R recognized and localized both artifacts
but did not report the result at all — 0 points. Sigh. The re-run even failed to
download the log files from AWS.
Now I am practicing the back-and-forth motion; the scenario is „encoded” in the
robot name. The first test was X0_10_20L: wait 0 seconds, then explore the
tunnel for 10 seconds, return to the entrance and try a second time for 20
seconds. It worked after many iterations. The origin is specified as a
PoseStamped,
described as position + orientation as a quaternion. Now X0_40_100_300L is
running, which should explore Simple Tunnel 02 three times and
always add one artifact. If it passes I may try
X0_40_100_300_40_100_300_40_100_300_40_100_300_40_100_300_40_100_300L.
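Parsing such a scenario name can look roughly like this. A sketch of the idea, assuming the pattern <prefix><wait>_<t1>_<t2>…<L|R>; the real code may differ, and the later versions use 'F' instead of '_' as a separator:

import re

def parse_scenario(robot_name):
    """Split e.g. 'X0_10_20L' into (wait_time, [explore_times], side)."""
    m = re.match(r'^[A-Z](\d+(?:_\d+)*)([LR])$', robot_name)
    assert m is not None, robot_name
    times = [int(t) for t in m.group(1).split('_')]
    side = 'left' if m.group(2) == 'L' else 'right'
    return times[0], times[1:], side

print(parse_scenario('X0_10_20L'))       # (0, [10, 20], 'left')
print(parse_scenario('X0_40_100_300L'))  # (0, [40, 100, 300], 'left')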
p.s. I completely forgot to mention why the Comeback robot … it was based on the
movie Vratné láhve (Empties)
and its „návratový typ” (returnable type) …
p.s.2 So maybe there will be no X0_40_100_300L tonight …
p.s.3 released ver19 and the code X0F40F100F300L
24-letter limit
… well, double check the assumptions.
5th October 2019 — Ver20 = Too many origins (4 days)
Yesterday I made a mistake. Or two. The first one is related to
/subt/pose_from_artifact — it is OK to query it when the robot is
initialized for the first time, so I know the global offset between the robot's initial
local position (0, 0, 0) and the global (0, 0, 0) at the tunnel entrance. But it is
not OK when the robot returns and updates its origin, sigh. I still keep the
robot positions in local coordinates starting from (0, 0, 0), but the offset
gets replaced by this new origin. So it would be better to keep the original
origin.
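In other words, remember the first reported origin and always convert local coordinates with it. A tiny sketch of the intended fix (class and method names are illustrative; orientation offset is ignored for simplicity):

class OriginTracker:
    """Convert local robot coordinates to global ones using the first origin only."""
    def __init__(self):
        self.origin = None  # global pose of the local (0, 0, 0)

    def set_origin(self, origin_xyz):
        # ignore later origin updates (e.g. when the robot re-enters the tunnel)
        if self.origin is None:
            self.origin = origin_xyz

    def to_global(self, local_xyz):
        ox, oy, oz = self.origin
        x, y, z = local_xyz
        return ox + x, oy + y, oz + z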
I wanted to collect some more data, so I reduced the log files already during
recording (before, I was removing the streams receiver.raw and rosmsg.image
afterwards, which took up something like 98%) … but the behavior changed … the SIMULATION behavior
changed! The robot collided at the tunnel entrance as it tried to stop from
moving at 1m/s:
Collision at the entrance
It was possible to repeat it 2/2; the accelerometer X value was -18.055m/s^2 and the
threshold for collision was 18. So I changed the limit to 20.
I probably made the second mistake last night when I was uploading ver18
and later ver19, replacing "_" with "F" …
But it is not easy to find out what went wrong … my guess is that I filled
the docker image into the form as a copy and paste of docker push
200670743174.dkr.ecr.us-east-1.amazonaws.com/subt/robotika:ver19 instead of just
200670743174.dkr.ecr.us-east-1.amazonaws.com/subt/robotika:ver19 …
… and here are new debug outputs with multiple returns in Simple Tunnel 02:
- logfile: zmq-subt-x2-191005_172226.log (27MB)
- video: Virtual SubT Robot X0F40F100F400L
6th October 2019 — Ver21 = base64 stdout logging (3 days)
I could not sleep well this morning, so I got up and started coding. What I had on
my mind was a dump of the log as base64 to the ROS stdout. This would have been crazy
before, but it became more reasonable with the reduced logs. I wrote a simple OSGAR
driver stdin, which reads text lines, optionally adds a line number and sends
them to the system. Then I plugged in rosmsg and zeromq and it should stream
standard input all the way to ROS. Yes, I had to tweak record.py so that it
will not record infinitely, but only until the last module finishes … ready to
test BUT:
martin@temp:~$ ssh -p 2200 gpu@localhost
ssh: connect to host localhost port 2200: Connection refused
martin@temp:~$
The remote machine is not available! F* … it looks like I will rest for the rest
of Sunday! Last night I did a test with a simulation of four X2 robots, just to see
how it behaves, but there was enough space and it should have already terminated
… hmm, … bad, bad, bad …
Here you can see at least some pictures of the other X2 robots (they were
detected as a red artifact, i.e. another error to be fixed):
p.s. it looks like the bridge docker has some memory leak?? That was the only
docker I am aware of leaving running last night … and the machine was heavily
swapping now … hard shutdown.
7th October 2019 — Ver22 = Hlava 22, hack log dump (2 days)
Last night, together with PavelJ, we failed to get the version with ZeroMQ to
re-enter the originally opened pipeline and dump the log file. Sigh. On the other
hand it worked if we were still running the basic navigation code … and it is
surely some stupid bug, as usual.
So it was time for an even more desperate solution:
def dumplog(self):
    import os
    filename = self.bus.logger.filename  # deep hack
    self.stdout("Dump Log:", filename)
    size = statinfo = os.stat(filename).st_size
    self.stdout("Size:", )
    with open(filename, 'rb') as f:
        for i in range(0, size, 100):
            self.stdout(i, f.read(100))
    self.stdout("Dump END")
… there was ""base64" for short moment, but then print of binary string
works, so maybe … later. I tried X0F40L and it dumped something! Note
that it is dumping OSGAR log file while it is still writing this output to the
same logfile — that is the motivation for file size at the beginning. Also
I am aware that the last 100bytes will not be complete … but I expect holes
inside so few bytes at the „end” do not matter much. And now I have
historically the very first log file from AWS!
Yeah, just to make sure I am able to decode it, right? But in the meantime I
triggered 5 simulations with the robots:
- A0F40F100F400L
- B10F40F100F400R
- C1200F40F100F400L
- D1220F40F100F400R
on docker 200670743174.dkr.ecr.us-east-1.amazonaws.com/subt/robotika:ver22.
Besides logging, it provides a fail-safe as in the System Track: if the pitch angle
is bigger than 20 degrees, try to return home a little and then continue with the
original side of the wall.
There was another joke — I was able to trigger a simulation with 5
robots! The credit system allows it (there are 1000 credits and the robot X2 in
configuration 4 with lidar and HD camera „costs” only 180 credits):
{ "sim_time_duration_avg": 2532, "sim_time_duration_std_dev": 0, "real_time_duration_avg": 8057, "real_time_duration_std_dev": 0, "model_count_avg": 5, "model_count_std_dev": 0 }
Now I am anxious to see the first recovered logfile … so an update probably
later today … here are some local simulations:
- zmq-subt-x2-191006_213808.log (27MB) — practice tunnel 01, sees extinguisher and backpack but fails on 2nd crossing
- zmq-subt-x2-191006_190653.log (18MB) — practice tunnel 03, old code, navigates through narrow passage and fails at 3rd crossing
- zmq-subt-x2-191006_211831.log (12MB) — practice tunnel 03, pitch limit 20deg, fails to enter tunnel on the first crossing
Do you see the artefact? Robot does …
p.s. the first observation:
0 356.072000000 INFO /subt_solution [/home/developer/subt_solution/src/subt_seed/src/subt_seed_node.cc:393(getSpeedCmd)] [topics: ] Python3: stdout Size:
… somebody forgot to print the value …
p.s.2 the AWS/ROS log file is not complete … it was pushed to OSGAR but not all
the way to ROS … i.e. waiting for completion is necessary. Downloaded 3672700
bytes. But still, not too bad!
… so 48 hours before the end of the competition I can start to work the way I am
used to … great! The hacked conversion utility is here:
aws2log.py.
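It is not reproduced here verbatim; the following is only a sketch of the idea, assuming each dumped chunk shows up in rosout.log as a line ending with "stdout <offset> b'…'" (the format produced by dumplog above) — the helper names are made up:

    import ast
    import re
    import sys

    # hypothetical pattern matching lines like "... Python3: stdout 1200 b'\x00\x01...'"
    CHUNK_RE = re.compile(r"stdout (\d+) (b['\"].*)$")

    def recover(rosout_path, out_path):
        with open(rosout_path, 'r', errors='replace') as src, open(out_path, 'wb') as out:
            for line in src:
                m = CHUNK_RE.search(line.rstrip())
                if m is None:
                    continue
                offset = int(m.group(1))
                data = ast.literal_eval(m.group(2))  # bytes literal as printed by Python
                out.seek(offset)        # write each chunk back at its original offset
                out.write(data)

    if __name__ == '__main__':
        recover(sys.argv[1], sys.argv[2])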
8th October 2019 — No Silver Bullet, Ver23 = origin ERROR, rosout.log.2 (1 day)
There is No Silver Bullet
which could now lead to a major software progress „jump”. Ver23 contains
two bugfixes and one experiment:
- if the request for the robot origin fails, do not repeat the query but send an origin ERROR message to Python3 and pretend that you are already at the tunnel entrance
- if the artifact is in -20 < ax < 0 and -10 < ay < 10, do not report it. It is probably another red X2 robot (see the sketch below)
- the wall distance was changed from 0.9 meters to 0.5 meters so the robot at least tries to enter the narrow stone passage in Practice Tunnel 02, where Kevin with drones already scored 2 points
And that is it.
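A minimal sketch of the second item, the red-robot filter (function names are made up; only the rectangle is from the list above):

    def looks_like_other_robot(ax, ay):
        # detections in this rectangle are most likely another red X2 near the staging area
        return -20 < ax < 0 and -10 < ay < 10

    def maybe_report_artifact(artf_type, ax, ay, az, report):
        if looks_like_other_robot(ax, ay):
            return False  # skip, do not waste a report attempt
        report(artf_type, ax, ay, az)
        return True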
Actually no, there is some other news, but no other code changes from
yesterday. The first note is regarding the logged output. I was too excited
yesterday morning, but it did not last long (1 hour?) — the next
rosout.log was cut! I was very upset about it, extended issue
#204, and even
wrote new code to be able to recover at least the last pieces of the OSGAR logfile
… only to find out later that it is rotated (indexed)!, i.e. beside the active
rosout.log there is also rosout.log.1 and sometimes rosout.log.2,
each 100MB long.
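So any recovery has to walk the rotated files oldest first — a trivial sketch (the helper is hypothetical, not from the repository):

    import os

    def rosout_files(log_dir):
        # rosout.log.2 is the oldest, rosout.log is the one currently being written
        names = ['rosout.log.2', 'rosout.log.1', 'rosout.log']
        return [os.path.join(log_dir, name) for name in names
                if os.path.exists(os.path.join(log_dir, name))]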
Another issue is with instances not starting properly, or taking a loooong time to be
removed (one simulation took 8 hours, where at least 2 hours were DeletingPods
with no errors, #235).
And finally old competitors are slowly showing up in the Leader Board
tables. There are 7 in Simple Tunnel 02 (including sodium-24-robotics)
and CYNET-ai is working on a submission
(#236).
One piece of news from this morning … I see a new report: TerminatingInstances Error:
AdminReview, so maybe they finally accepted that I really use the same docker
just many times, and that it is strange that it sometimes starts and sometimes does
not. Moreover, it is reduced to a single ROS C++ main (single threaded)
based on their subt_seed example … sigh.
9th October 2019 — The Gods Themselves (0 days)
Isaac Asimov wrote a nice novel,
The Gods Themselves, which
was also translated into Czech as
Ani sami
bohové. I had the Czech translation on my mind, which re-translated
back to English would be „Not Even The Gods Themselves” … you may not know
what I am talking about, but you may guess.
The simulation environment was upgraded yesterday. Yes, no kidding — see
the
release notes. The team Sophisticated Engineering just gave up and
deployed their solution while the machines were down (see
#241).
I submitted a small test (short, two robots in Simple Tunnel 02 … which
is not short, because it also tries to submit the OSGAR logs), it failed, was
reviewed by admins and restarted … it is not finished yet. Now it is the
next morning.
Sigh.
What I am testing is ver2505 and ver2509 — simply ver25 with a
smaller and a larger walldist parameter. These would be my notes for the launch
… in the unlikely (?) case I throw my notebook out of the window, so somebody
from the team (PavelJ?) can take over … BTW another joke: PavelJ wanted to
check the images locally, and that was the last thing he could do last night …
maybe he is still downloading now.
A10F2400L  B60F1500R  200670743174.dkr.ecr.us-east-1.amazonaws.com/subt/robotika:ver2505
C120F100F400F1000L  D400F100F400F1000R  200670743174.dkr.ecr.us-east-1.amazonaws.com/subt/robotika:ver2509
Final(?) strategy
… as SusanB would say 20+ years ago Everything is going to be
alright, because Life is Good (R) …
Yes, it is. The Virtual Track/Tunnel Circuit ends today.
p.s. I liked this screenshot:
It was for the same robot name codes in Practice Tunnel 01, when only robot D was
able to dump the OSGAR log into the rosout.log file. BTW robot C did
not start at all (see
#225).
10th October 2019 — The 5 stages DABDA or DARPA?
DABDA. … Denial; Anger; Bargaining; Depression; Acceptance.
It is not easy even to die … Error #5507: Circuit simulation submission limit reached
15th October 2019 — 5 days after
21st October 2019 — 11 days after
23rd October 2019 — Robotika Virtual Presentation
It is now two weeks after our final solution docker submission and one week to
the final DARPA announcement. A few days ago we received the following mail:
Virtual Tunnel Circuit Awards Ceremony Information for October 30th
Virtual Competitors, Please join us for the DARPA SubT Challenge Virtual
Tunnel Circuit Awards Ceremony & Technical Interchange Meeting on Wednesday,
October 30 at 11:00 AM EDT. The program will be broadcast via DARPA’s
YouTube channel: DARPAtv and will
last about an hour. For those who are unable to view the live stream in real
time, the broadcast will be recorded and posted to DARPAtv at the conclusion of
the event.
As part of the program, we request that each virtual team uses the attached
template to create a 5-minute presentation with recorded narration for DARPA to
broadcast. We look forward to celebrating our virtual competitors!
Well, so here is our Robotika Virtual
Presentation. It was created by a slightly poisoned mind … somehow the
two screenshots I posted 5 and 11 days after the final submission are still
relevant. They are interlaced every once in a while.
If you think that my spoken English is much better now, then you are wrong.
In order to reduce emotions I used robots to do the job, and
MichaelV3 from IBM Watson was
the best. A useful overview link was
how to use text2speech
in Python.
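For the record, the narration could be generated roughly like this with the ibm-watson Python SDK — a sketch only; the API key and service URL are placeholders and the actual script may have differed:

    from ibm_watson import TextToSpeechV1
    from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

    authenticator = IAMAuthenticator('YOUR_API_KEY')   # placeholder credentials
    tts = TextToSpeechV1(authenticator=authenticator)
    tts.set_service_url('https://api.us-south.text-to-speech.watson.cloud.ibm.com')  # placeholder region

    text = 'Robotika participated in the Virtual Tunnel Circuit.'
    result = tts.synthesize(text, voice='en-US_MichaelV3Voice', accept='audio/wav').get_result()
    with open('narration.wav', 'wb') as audio_file:
        audio_file.write(result.content)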
I guess there will be no update until the final announcement, so we will see.
At the moment I am aware of 3 potential fatal bugs:
1) OSGAR is generating motion commands on a 10Hz timer. They are published to
ROS/Ignition while the CloudSim docker is still booting. I suppose that
ROS/Ignition should drop the commands as soon as the queue is full, but nobody
knows. Arthur mentioned this issue, but I do not know the conclusion yet.
2) ver25 uses laser scan differences to detect a stuck robot, and after 3 seconds
it generates a Collision. This should be ignored at the beginning during the turn
and straight motion towards the tunnel entrance, but I am not 100% sure (a rough
sketch of this detection follows after this list).
3) finally, the expected tunnel slope was never tested in Ignition. During
qualification the tunnels went up and down, so they could now trigger a return
home although the robot did not hit any stone. The limit in the released version
was 20 degrees, as used in the System Track.
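The detection from point 2 works roughly like this sketch (class name and the scan-change threshold are made up; only the 3 second window comes from the text):

    class StuckDetector:
        def __init__(self, timeout_sec=3.0, min_change=0.05):
            self.timeout_sec = timeout_sec
            self.min_change = min_change      # meters, hypothetical threshold
            self.prev_scan = None
            self.last_change_time = None

        def is_stuck(self, time_sec, scan):
            # scan: list of lidar ranges in meters; True when the scan has not
            # changed noticeably for longer than timeout_sec
            if self.last_change_time is None:
                self.last_change_time = time_sec
            if self.prev_scan is not None and len(scan) == len(self.prev_scan):
                max_diff = max(abs(a - b) for a, b in zip(scan, self.prev_scan))
                if max_diff > self.min_change:
                    self.last_change_time = time_sec
            self.prev_scan = scan
            return time_sec - self.last_change_time > self.timeout_sec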
So see you in a week on DARPAtv unless
some unexpected party happens …
p.s. if you ever encounter problems with PowerPoint and audio, replace the mp3
format by wav (see
MS
forum). Yes, the original 3MB presentation with voice will grow to 23MB …
yeah, rather no comment.
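One possible way to do the conversion in Python, assuming pydub and ffmpeg are available (not necessarily what was used here):

    from pydub import AudioSegment  # requires ffmpeg on the system

    # re-encode the narration so PowerPoint accepts it
    AudioSegment.from_mp3('narration.mp3').export('narration.wav', format='wav')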
30th October 2019 — Tunnel Circuit Awards Ceremony
Virtual Competitors,
I hope you are having a great week so far. We are getting ready for the Awards
Ceremony tomorrow and wanted to make sure we had all of the updated
information.
You can access the direct link below to watch the stream of the awards
ceremony starting at 11:00 am EDT https://www.youtube.com/watch?v=JxwL2eGQQs8.
For those who are unable to view the live stream in real time, the broadcast
will be recorded and posted to DARPAtv at the conclusion of the event.
As a reminder at the end of the program Dr. Chung will respond to questions
collected via the community forum. In order to submit questions you will need
to create an account (if you haven’t already) and post the question under the
“Questions” category and add the tag virtual-tunnelawards. A limited number
of questions will be answered live; however, Q&A will be posted on the
community forum after the meeting.
OK, so for us it means today (30th October 2019) at 4pm Central European Time.
I posted 3 questions related to my complaints:
and in less than 10 hours we should see:
31st October 2019 — Robotika Wins 100000USD!
table of results
3rd November 2019 — Robotika Virtual Videos
OSGAR logfiles
- 01/A10F2400L — aws-zmq-subt-x2-191011_124211.log, https://youtu.be/WiunGpoDvUE
- 02/B60F1500R — aws-zmq-subt-x2-191011_124204.log, https://youtu.be/HsycOeeyqtY
- 02/D400F100F400F1000R — aws-zmq-subt-x2-191011_124206.log, https://youtu.be/zm7ktpMHpWg
- 03/D400F100F400F1000R — aws-zmq-subt-x2-191011_124228.log, https://youtu.be/XX_HBa2I4ZE
- 10/C120F100F400F1000L — aws-zmq-subt-x2-191022_183425.log, https://youtu.be/yJhcvtdr-Oo
- 15/B60F1500R — aws-zmq-subt-x2-191023_125435.log, https://youtu.be/mvNtzEm7ME0
For more info see issue #258.