Intelligent Systems
Note: This research group has relocated.


2024


Physics-Based Rigid Body Object Tracking and Friction Filtering From RGB-D Videos

Kandukuri, R. K., Strecke, M., Stueckler, J.

In Proceedings of the International Conference on 3D Vision (3DV), 2024 (inproceedings)

Abstract
Physics-based understanding of object interactions from sensory observations is an essential capability in augmented reality and robotics. It enables capturing the properties of a scene for simulation and control. In this paper, we propose a novel approach for real-to-sim which tracks rigid objects in 3D from RGB-D images and infers physical properties of the objects. We use a differentiable physics simulation as the state-transition model in an Extended Kalman Filter which can model contact and friction for arbitrary mesh-based shapes and in this way estimate physically plausible trajectories. We demonstrate that our approach can filter position, orientation, velocities, and concurrently can estimate the coefficient of friction of the objects. We analyze our approach on various sliding scenarios in synthetic image sequences of single objects and colliding objects. We also demonstrate and evaluate our approach on a real-world dataset. We make our novel benchmark datasets publicly available to foster future research in this novel problem setting and comparison with our method.

preprint supplemental video dataset link (url) DOI [BibTex]

Learning a Terrain- and Robot-Aware Dynamics Model for Autonomous Mobile Robot Navigation

Achterhold, J., Guttikonda, S., Kreber, J. U., Li, H., Stueckler, J.

CoRR abs/2409.11452, 2024, Preprint submitted to Robotics and Autonomous Systems Journal. https://arxiv.org/abs/2409.11452 (techreport) Submitted

Abstract
Mobile robots should be capable of planning cost-efficient paths for autonomous navigation. Typically, the terrain and robot properties are subject to variations. For instance, properties of the terrain such as friction may vary across different locations. Also, properties of the robot may change such as payloads or wear and tear, e.g., causing changing actuator gains or joint friction. Autonomous navigation approaches should thus be able to adapt to such variations. In this article, we propose a novel approach for learning a probabilistic, terrain- and robot-aware forward dynamics model (TRADYN) which can adapt to such variations and demonstrate its use for navigation. Our learning approach extends recent advances in meta-learning forward dynamics models based on Neural Processes for mobile robot navigation. We evaluate our method in simulation for 2D navigation of a robot with uni-cycle dynamics with varying properties on terrain with spatially varying friction coefficients. In our experiments, we demonstrate that TRADYN has lower prediction error over long time horizons than model ablations which do not adapt to robot or terrain variations. We also evaluate our model for navigation planning in a model-predictive control framework and under various sources of noise. We demonstrate that our approach yields improved performance in planning control-efficient paths by taking robot and terrain properties into account.

preprint [BibTex]

Physically Plausible Object Pose Refinement in Cluttered Scenes

Strecke, M., Stueckler, J.

In Proceedings of the German Conference on Pattern Recognition (GCPR), 2024, to appear (inproceedings) To be published

code preprint (submitted version) [BibTex]

Analytical Uncertainty-Based Loss Weighting in Multi-Task Learning

Kirchdorfer, L., Elich, C., Kutsche, S., Stuckenschmidt, H., Schott, L., Köhler, J. M.

In Proceedings of the German Conference on Pattern Recognition (GCPR), 2024, to appear (inproceedings) To be published

preprint [BibTex]

Attention Normalization Impacts Cardinality Generalization in Slot Attention

Krimmel, M., Achterhold, J., Stueckler, J.

In Transactions on Machine Learning Research (TMLR), 2024 (article)

Abstract
Object-centric scene decompositions are important representations for downstream tasks in fields such as computer vision and robotics. The recently proposed Slot Attention module, already leveraged by several derivative works for image segmentation and object tracking in videos, is a deep learning component which performs unsupervised object-centric scene decomposition on input images. It is based on an attention architecture, in which latent slot vectors, which hold compressed information on objects, attend to localized perceptual features from the input image. In this paper, we demonstrate that design decisions on normalizing the aggregated values in the attention architecture have considerable impact on the capabilities of Slot Attention to generalize to a higher number of slots and objects than seen during training. We propose and investigate alternatives to the original normalization scheme which increase the generalization capabilities of Slot Attention to varying slot and object counts, resulting in performance gains on the task of unsupervised image segmentation. The newly proposed normalizations represent minimal and easy-to-implement modifications of the usual Slot Attention module, changing the value aggregation mechanism from a weighted mean operation to a scaled weighted sum operation.

preprint source code video link (url) [BibTex]


Event-based Non-Rigid Reconstruction of Low-Rank Parametrized Deformations from Contours

Xue, Y., Li, H., Leutenegger, S., Stueckler, J.

International Journal of Computer Vision (IJCV), 2024 (article)

Abstract
Visual reconstruction of fast non-rigid object deformations over time is a challenge for conventional frame-based cameras. In recent years, event cameras have gained significant attention due to their bio-inspired properties, such as high temporal resolution and high dynamic range. In this paper, we propose a novel approach for reconstructing such deformations using event measurements. Under the assumption of a static background, where all events are generated by the motion, our approach estimates the deformation of objects from events generated at the object contour in a probabilistic optimization framework. It associates events to mesh faces on the contour and maximizes the alignment of the line of sight through the event pixel with the associated face. In experiments on synthetic and real data of human body motion, we demonstrate the advantages of our method over state-of-the-art optimization and learning-based approaches for reconstructing the motion of human arms and hands. In addition, we propose an efficient event stream simulator to synthesize realistic event data for human motion.

DOI [BibTex]

Incremental Few-Shot Adaptation for Non-Prehensile Object Manipulation using Parallelizable Physics Simulators

Baumeister, F., Mack, L., Stueckler, J.

CoRR abs/2409.13228, 2024, Submitted to IEEE International Conference on Robotics and Automation (ICRA) 2025 (techreport) Submitted

Abstract
Few-shot adaptation is an important capability for intelligent robots that perform tasks in open-world settings such as everyday environments or flexible production. In this paper, we propose a novel approach for non-prehensile manipulation which iteratively adapts a physics-based dynamics model for model-predictive control. We adapt the parameters of the model incrementally with a few examples of robot-object interactions. This is achieved by sampling-based optimization of the parameters using a parallelizable rigid-body physics simulation as dynamic world model. In turn, the optimized dynamics model can be used for model-predictive control using efficient sampling-based optimization. We evaluate our few-shot adaptation approach in several object pushing experiments in simulation and with a real robot.

preprint supplemental video link (url) [BibTex]

Examining Common Paradigms in Multi-Task Learning

Elich, C., Kirchdorfer, L., Köhler, J. M., Schott, L.

In Proceedings of the German Conference on Pattern Recognition (GCPR), 2024, to appear (inproceedings) To be published

preprint [BibTex]

Online Calibration of a Single-Track Ground Vehicle Dynamics Model by Tight Fusion with Visual-Inertial Odometry

Li, H., Stueckler, J.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2024 (inproceedings)

Abstract
Wheeled mobile robots need the ability to estimate their motion and the effect of their control actions for navigation planning. In this paper, we present ST-VIO, a novel approach which tightly fuses a single-track dynamics model for wheeled ground vehicles with visual-inertial odometry (VIO). Our method calibrates and adapts the dynamics model online to improve the accuracy of forward prediction conditioned on future control inputs. The single-track dynamics model approximates wheeled vehicle motion under specific control inputs on flat ground using ordinary differential equations. We use a singularity-free and differentiable variant of the single-track model to enable seamless integration as dynamics factor into VIO and to optimize the model parameters online together with the VIO state variables. We validate our method with real-world data in both indoor and outdoor environments with different terrain types and wheels. In experiments, we demonstrate that ST-VIO can not only adapt to wheel or ground changes and improve the accuracy of prediction under new control inputs, but can even improve tracking accuracy.

preprint supplemental video code datasets link (url) DOI [BibTex]

2023


Black-Box vs. Gray-Box: A Case Study on Learning Table Tennis Ball Trajectory Prediction with Spin and Impacts

Achterhold, J., Tobuschat, P., Ma, H., Büchler, D., Muehlebach, M., Stueckler, J.

In Proceedings of the 5th Annual Learning for Dynamics and Control Conference (L4DC), 211, pages: 878-890, Proceedings of Machine Learning Research, (Editors: Nikolai Matni, Manfred Morari and George J. Pappas), PMLR, June 2023 (inproceedings)

preprint code link (url) [BibTex]

Object-Level Dynamic Scene Reconstruction With Physical Plausibility From RGB-D Images

Strecke, M. F.

Eberhard Karls Universität Tübingen, Tübingen, 2023 (phdthesis)

Abstract
Humans have the remarkable ability to perceive and interact with objects in the world around them. They can easily segment objects from visual data and have an intuitive understanding of how physics influences objects. By contrast, robots are so far often constrained to tailored environments for a specific task, due to their inability to reconstruct a versatile and accurate scene representation. In this thesis, we combine RGB-D video data with background knowledge of real-world physics to develop such a representation for robots.

Our contributions can be separated into two main parts: a dynamic object tracking tool and optimization frameworks that allow for improving shape reconstructions based on physical plausibility. The dynamic object tracking tool "EM-Fusion" detects, segments, reconstructs, and tracks objects from RGB-D video data. We propose a probabilistic data association approach for attributing the image pixels to the different moving objects in the scene. This allows us to track and reconstruct moving objects and the background scene with state-of-the-art accuracy and robustness towards occlusions.

We investigate two ways of further optimizing the reconstructed shapes of moving objects based on physical plausibility. The first of these, "Co-Section", includes physical plausibility by reasoning about the empty space around an object. We observe that no two objects can occupy the same space at the same time and that the depth images in the input video provide an estimate of observed empty space. Based on these observations, we propose intersection and hull constraints, which we combine with the observed surfaces in a global optimization approach. Compared to EM-Fusion, which only reconstructs the observed surface, Co-Section optimizes watertight shapes. These watertight shapes provide a rough estimate of unseen surfaces and could be useful as initialization for further refinement, e.g., by interactive perception. In the second optimization approach, "DiffSDFSim", we reason about object shapes based on physically plausible object motion. We observe that object trajectories after collisions depend on the object's shape, and extend a differentiable physics simulation for optimizing object shapes together with other physical properties (e.g., forces, masses, friction) based on the motion of the objects and their interactions. Our key contributions are using signed distance function models for representing shapes and a novel method for computing gradients that models the dependency of the time of contact on object shapes. We demonstrate that our approach recovers target shapes well by fitting to target trajectories and depth observations. Further, the ground-truth trajectories are recovered well in simulation using the resulting shape and physical properties. This enables predictions about the future motion of objects by physical simulation.

We anticipate that our contributions can be useful building blocks in the development of 3D environment perception for robots. The reconstruction of individual objects as in EM-Fusion is a key ingredient required for interactions with objects. Completed shapes as the ones provided by Co-Section provide useful cues for planning interactions like grasping of objects. Finally, the recovery of shape and other physical parameters using differentiable simulation as in DiffSDFSim allows simulating objects and thus predicting the effects of interactions. Future work might extend the presented works for interactive perception of dynamic environments by comparing these predictions with observed real-world interactions to further improve the reconstructions and physical parameter estimations.

link (url) DOI [BibTex]


Visual-Inertial and Leg Odometry Fusion for Dynamic Locomotion

Dhédin, V., Li, H., Khorshidi, S., Mack, L., Ravi, A. K. C., Meduri, A., Shah, P., Grimminger, F., Righetti, L., Khadiv, M., Stueckler, J.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2023 (inproceedings)

Abstract
Implementing dynamic locomotion behaviors on legged robots requires a high-quality state estimation module. Especially when the motion includes flight phases, state-of-the-art approaches fail to produce reliable estimation of the robot posture, in particular base height. In this paper, we propose a novel approach for combining visual-inertial odometry (VIO) with leg odometry in an extended Kalman filter (EKF) based state estimator. The VIO module uses a stereo camera and IMU to yield low-drift 3D position and yaw orientation and drift-free pitch and roll orientation of the robot base link in the inertial frame. However, these values have a considerable amount of latency due to image processing and optimization, while the rate of update is quite low which is not suitable for low-level control. To reduce the latency, we predict the VIO state estimate at the rate of the IMU measurements of the VIO sensor. The EKF module uses the base pose and linear velocity predicted by VIO, fuses them further with a second high-rate IMU and leg odometry measurements, and produces robot state estimates with a high frequency and small latency suitable for control. We integrate this lightweight estimation framework with a nonlinear model predictive controller and show successful implementation of a set of agile locomotion behaviors, including trotting and jumping at varying horizontal speeds, on a torque-controlled quadruped robot.

preprint video link (url) DOI [BibTex]

Learning-based Relational Object Matching Across Views

Elich, C., Armeni, I., Oswald, M. R., Pollefeys, M., Stueckler, J.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2023 (inproceedings)

Abstract
Intelligent robots require object-level scene understanding to reason about possible tasks and interactions with the environment. Moreover, many perception tasks such as scene reconstruction, image retrieval, or place recognition can benefit from reasoning on the level of objects. While keypoint-based matching can yield strong results for finding correspondences for images with small to medium view point changes, for large view point changes, matching semantically on the object-level becomes advantageous. In this paper, we propose a learning-based approach which combines local keypoints with novel object-level features for matching object detections between RGB images. We train our object-level matching features based on appearance and inter-frame and cross-frame spatial relations between objects in an associative graph neural network. We demonstrate our approach in a large variety of views on realistically rendered synthetic images. Our approach compares favorably to previous state-of-the-art object-level matching approaches and achieves improved performance over a pure keypoint-based approach for large view-point changes.

preprint code link (url) DOI [BibTex]

Context-Conditional Navigation with a Learning-Based Terrain- and Robot-Aware Dynamics Model

Guttikonda, S., Achterhold, J., Li, H., Boedecker, J., Stueckler, J.

In Proceedings of the European Conference on Mobile Robots (ECMR), 2023 (inproceedings)

Abstract
In autonomous navigation settings, several quantities can be subject to variations. Terrain properties such as friction coefficients may vary over time depending on the location of the robot. Also, the dynamics of the robot may change due to, e.g., different payloads, changing the system's mass, or wear and tear, changing actuator gains or joint friction. An autonomous agent should thus be able to adapt to such variations. In this paper, we develop a novel probabilistic, terrain- and robot-aware forward dynamics model, termed TRADYN, which is able to adapt to the above-mentioned variations. It builds on recent advances in meta-learning forward dynamics models based on Neural Processes. We evaluate our method in a simulated 2D navigation setting with a unicycle-like robot and different terrain layouts with spatially varying friction coefficients. In our experiments, the proposed model exhibits lower prediction error for the task of long-horizon trajectory prediction, compared to non-adaptive ablation models. We also evaluate our model on the downstream task of navigation planning, which demonstrates improved performance in planning control-efficient paths by taking robot and terrain properties into account.

preprint code link (url) DOI [BibTex]

2022


Weakly Supervised Learning of Multi-Object 3D Scene Decompositions Using Deep Shape Priors

Elich, C., Oswald, M. R., Pollefeys, M., Stueckler, J.

Computer Vision and Image Understanding (CVIU), 220, July 2022 (article)

Abstract
Representing scenes at the granularity of objects is a prerequisite for scene understanding and decision making. We propose PriSMONet, a novel approach based on Prior Shape knowledge for learning Multi-Object 3D scene decomposition and representations from single images. Our approach learns to decompose images of synthetic scenes with multiple objects on a planar surface into its constituent scene objects and to infer their 3D properties from a single view. A recurrent encoder regresses a latent representation of 3D shape, pose and texture of each object from an input RGB image. By differentiable rendering, we train our model to decompose scenes from RGB-D images in a self-supervised way. The 3D shapes are represented continuously in function-space as signed distance functions which we pre-train from example shapes in a supervised way. These shape priors provide weak supervision signals to better condition the challenging overall learning task. We evaluate the accuracy of our model in inferring 3D scene layout, demonstrate its generative capabilities, assess its generalization to real images, and point out benefits of the learned representation.

Link Preprint link (url) DOI Project Page [BibTex]

Visual-Inertial Odometry with Online Calibration of Velocity-Control Based Kinematic Motion Models

Li, H., Stueckler, J.

IEEE Robotics and Automation Letters, 7(3):6415-6422, July 2022, Accepted for oral presentation at IEEE ICRA 2023 (article)

Abstract
Visual-inertial odometry (VIO) is an important technology for autonomous robots with power and payload constraints. In this paper, we propose a novel approach for VIO with stereo cameras which integrates and calibrates the velocity-control based kinematic motion model of wheeled mobile robots online. Including such a motion model can help to improve the accuracy of VIO. Compared to several previous approaches proposed to integrate wheel odometer measurements for this purpose, our method does not require wheel encoders and can be applied when the robot motion can be modeled with a velocity-control based kinematic motion model. We use radial basis function (RBF) kernels to compensate for the time delay and deviations between control commands and actual robot motion. The motion model is calibrated online by the VIO system and can be used as a forward model for motion control and planning. We evaluate our approach with data obtained in variously sized indoor environments, demonstrate improvements over a pure VIO method, and evaluate the prediction accuracy of the online calibrated model.

preprint link (url) DOI Project Page [BibTex]

Observability Analysis of Visual-Inertial Odometry with Online Calibration of Velocity-Control Based Kinematic Motion Models

Li, H., Stueckler, J.

abs/2204.06651, CoRR/arxiv, 2022 (techreport)

Abstract
In this paper, we analyze the observability of visual-inertial odometry (VIO) using stereo cameras with a velocity-control based kinematic motion model. Previous work shows that, in the general case, the global position and yaw are unobservable in a VIO system; additionally, roll and pitch also become unobservable if there is no rotation. We prove that by integrating a planar motion constraint, roll and pitch become observable. We also show that the parameters of the motion model are observable.

link (url) [BibTex]


Event-based Non-Rigid Reconstruction from Contours

(Best Student Paper Award)

Xue, Y., Li, H., Leutenegger, S., Stueckler, J.

In Proceedings of the British Machine Vision Conference (BMVC), 2022 (inproceedings)

Abstract
Visual reconstruction of fast non-rigid object deformations over time is a challenge for conventional frame-based cameras. In this paper, we propose a novel approach for reconstructing such deformations using measurements from event-based cameras. Our approach estimates the deformation of objects from events generated at the object contour in a probabilistic optimization framework. It associates events to mesh faces on the contour and maximizes the alignment of the line of sight through the event pixel with the associated face. In experiments on synthetic and real data, we demonstrate the advantages of our method over state-of-the-art optimization and learning-based approaches for reconstructing the motion of human hands.

preprint video link (url) [BibTex]

Learning Temporally Extended Skills in Continuous Domains as Symbolic Actions for Planning

Achterhold, J., Krimmel, M., Stueckler, J.

In Proceedings of The 6th Conference on Robot Learning, 205, pages: 225-236, Proceedings of Machine Learning Research, 6th Annual Conference on Robot Learning (CoRL 2022), 2022 (inproceedings)

Abstract
Problems which require both long-horizon planning and continuous control capabilities pose significant challenges to existing reinforcement learning agents. In this paper we introduce a novel hierarchical reinforcement learning agent which links temporally extended skills for continuous control with a forward model in a symbolic discrete abstraction of the environment’s state for planning. We term our agent SEADS for Symbolic Effect-Aware Diverse Skills. We formulate an objective and corresponding algorithm which leads to unsupervised learning of a diverse set of skills through intrinsic motivation given a known state abstraction. The skills are jointly learned with the symbolic forward model which captures the effect of skill execution in the state abstraction. After training, we can leverage the skills as symbolic actions using the forward model for long-horizon planning and subsequently execute the plan using the learned continuous-action control skills. The proposed algorithm learns skills and forward models that can be used to solve complex tasks which require both continuous control and long-horizon planning capabilities with high success rate. It compares favorably with other flat and hierarchical reinforcement learning baseline agents and is successfully demonstrated with a real robot.

preprint project website link (url) Project Page [BibTex]

2021


DiffSDFSim: Differentiable Rigid-Body Dynamics With Implicit Shapes

Strecke, M., Stückler, J.

In 2021 International Conference on 3D Vision (3DV 2021), pages: 96-105, International Conference on 3D Vision (3DV 2021), December 2021 (inproceedings)

Project website Preprint Code link (url) DOI Project Page [BibTex]

Physically Plausible Tracking & Reconstruction of Dynamic Objects

Strecke, M., Stückler, J.

KIT Science Week Scientific Conference & DGR-Days 2021, October 2021 (talk)

[BibTex]

Explore the Context: Optimal Data Collection for Context-Conditional Dynamics Models

Achterhold, J., Stueckler, J.

In Proceedings of The 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021), 130, JMLR, Cambridge, MA, The 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021), April 2021, preprint CoRR abs/2102.11394 (inproceedings)

Abstract
In this paper, we learn dynamics models for parametrized families of dynamical systems with varying properties. The dynamics models are formulated as stochastic processes conditioned on a latent context variable which is inferred from observed transitions of the respective system. The probabilistic formulation allows us to compute an action sequence which, for a limited number of environment interactions, optimally explores the given system within the parametrized family. This is achieved by steering the system through transitions being most informative for the context variable. We demonstrate the effectiveness of our method for exploration on a non-linear toy-problem and two well-known reinforcement learning environments.

Preprint Project page Poster link (url) Project Page [BibTex]

Tracking 6-DoF Object Motion from Events and Frames

Li, H., Stueckler, J.

In Proc. of IEEE Int. Conf. on Robotics and Automation (ICRA), 2021 (inproceedings)

preprint link (url) DOI Project Page [BibTex]

Physical Representation Learning and Parameter Identification from Video Using Differentiable Physics

Kandukuri, R., Achterhold, J., Moeller, M., Stueckler, J.

International Journal of Computer Vision, 130, pages: 3-16, 2021 (article)

link (url) DOI Project Page [BibTex]

2020


Where Does It End? - Reasoning About Hidden Surfaces by Object Intersection Constraints

Strecke, M., Stückler, J.

In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020), pages: 9589-9597, IEEE, Piscataway, NJ, IEEE/CVF International Conference on Computer Vision and Pattern Recognition (CVPR 2020), June 2020, preprint CoRR abs/2004.04630 (inproceedings)

preprint project page Code DOI Project Page [BibTex]

DirectShape: Photometric Alignment of Shape Priors for Visual Vehicle Pose and Shape Estimation

Wang, R., Yang, N., Stückler, J., Cremers, D.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages: 11067-11073, IEEE, Piscataway, NJ, IEEE International Conference on Robotics and Automation (ICRA 2020), May 2020, arXiv:1904.10097 (inproceedings)

DOI Project Page [BibTex]

Numerical Quadrature for Probabilistic Policy Search

Vinogradska, J., Bischoff, B., Achterhold, J., Koller, T., Peters, J.

IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(1):164-175, 2020 (article)

DOI [BibTex]

25th International Symposium on Vision, Modeling and Visualization, VMV 2020
(Editors: Jens Krüger and Matthias Nießner and Jörg Stückler), Eurographics Association, 2020 (proceedings)

[BibTex]

Learning to Identify Physical Parameters from Video Using Differentiable Physics

Kandukuri, R., Achterhold, J., Moeller, M., Stueckler, J.

Proc. of the 42nd German Conference on Pattern Recognition (GCPR), 2020, GCPR 2020 Honorable Mention, preprint https://arxiv.org/abs/2009.08292 (conference)

link (url) Project Page [BibTex]

TUM Flyers: Vision-Based MAV Navigation for Systematic Inspection of Structures

Usenko, V., Stumberg, L. V., Stückler, J., Cremers, D.

In Bringing Innovative Robotic Technologies from Research Labs to Industrial End-users: The Experience of the European Robotics Challenges, 136, pages: 189-209, Springer Tracts in Advanced Robotics, Springer International Publishing, 2020 (inbook)

link (url) DOI [BibTex]

Planning from Images with Deep Latent Gaussian Process Dynamics

Bosch, N., Achterhold, J., Leal-Taixe, L., Stückler, J.

Proceedings of the 2nd Conference on Learning for Dynamics and Control (L4DC), 120, pages: 640-650, Proceedings of Machine Learning Research (PMLR), (Editors: Alexandre M. Bayen and Ali Jadbabaie and George Pappas and Pablo A. Parrilo and Benjamin Recht and Claire Tomlin and Melanie Zeilinger), 2020, preprint arXiv:2005.03770 (conference)

Preprint Project page Code poster link (url) Project Page [BibTex]

Sample-efficient Cross-Entropy Method for Real-time Planning

Pinneri, C., Sawant, S., Blaes, S., Achterhold, J., Stueckler, J., Rolinek, M., Martius, G.

In Conference on Robot Learning 2020, 2020 (inproceedings)

Abstract
Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency prevents them from being used for real-time planning and control. We propose an improved version of the CEM algorithm for fast planning, with novel additions including temporally-correlated actions and memory, requiring 2.7-22x fewer samples and yielding a performance increase of 1.2-10x in high-dimensional control problems.

Paper Code Spotlight-Video link (url) Project Page [BibTex]


Visual-Inertial Mapping with Non-Linear Factor Recovery

Usenko, V., Demmel, N., Schubert, D., Stückler, J., Cremers, D.

IEEE Robotics and Automation Letters (RA-L), 5(2):422-429, 2020, presented at IEEE International Conference on Robotics and Automation (ICRA) 2020, preprint arXiv:1904.06504 (article)

Abstract
Cameras and inertial measurement units are complementary sensors for ego-motion estimation and environment mapping. Their combination makes visual-inertial odometry (VIO) systems more accurate and robust. For globally consistent mapping, however, combining visual and inertial information is not straightforward. To estimate the motion and geometry with a set of images, large baselines are required. Because of that, most systems operate on keyframes that have large time intervals between each other. Inertial data, on the other hand, quickly degrades with the duration of the intervals, and after several seconds of integration it typically contains only little useful information. In this paper, we propose to extract relevant information for visual-inertial mapping from visual-inertial odometry using non-linear factor recovery. We reconstruct a set of non-linear factors that make an optimal approximation of the information on the trajectory accumulated by VIO. To obtain a globally consistent map, we combine these factors with loop-closing constraints using bundle adjustment. The VIO factors make the roll and pitch angles of the global map observable, and improve the robustness and the accuracy of the mapping. In experiments on a public benchmark, we demonstrate superior performance of our method over the state-of-the-art approaches.

Code Preprint link (url) Project Page [BibTex]

Learning to Adapt Multi-View Stereo by Self-Supervision

Mallick, A., Stückler, J., Lensch, H.

In Proceedings of the British Machine Vision Conference (BMVC), 2020, preprint https://arxiv.org/abs/2009.13278 (inproceedings)

link (url) Project Page [BibTex]


2019


EM-Fusion: Dynamic Object-Level SLAM With Probabilistic Data Association

Strecke, M., Stückler, J.

In Proceedings IEEE/CVF International Conference on Computer Vision 2019 (ICCV), pages: 5864-5873, IEEE, 2019 IEEE/CVF International Conference on Computer Vision (ICCV), October 2019 (inproceedings)

preprint Project page Code Poster DOI Project Page [BibTex]

Learning to Disentangle Latent Physical Factors for Video Prediction

Zhu, D., Munderloh, M., Rosenhahn, B., Stückler, J.

In Pattern Recognition - Proceedings German Conference on Pattern Recognition (GCPR), Springer International, German Conference on Pattern Recognition (GCPR), September 2019 (inproceedings)

dataset & evaluation code video preprint DOI Project Page [BibTex]

3D Birds-Eye-View Instance Segmentation

Elich, C., Engelmann, F., Kontogianni, T., Leibe, B.

In Pattern Recognition - Proceedings 41st DAGM German Conference, DAGM GCPR 2019, pages: 48-61, Lecture Notes in Computer Science (LNCS) 11824, (Editors: Fink G.A., Frintrop S., Jiang X.), Springer, 2019 German Conference on Pattern Recognition (GCPR), September 2019, ISSN: 03029743 (inproceedings)

[BibTex]


2018


Direct Sparse Odometry With Rolling Shutter

Schubert, D., Usenko, V., Demmel, N., Stueckler, J., Cremers, D.

In European Conference on Computer Vision (ECCV), September 2018, oral presentation (inproceedings)

[BibTex]

Deep Virtual Stereo Odometry: Leveraging Deep Depth Prediction for Monocular Direct Sparse Odometry

Yang, N., Wang, R., Stueckler, J., Cremers, D.

In European Conference on Computer Vision (ECCV), September 2018, oral presentation, preprint https://arxiv.org/abs/1807.02570 (inproceedings)

link (url) [BibTex]

The TUM VI Benchmark for Evaluating Visual-Inertial Odometry

Schubert, D., Goll, T., Demmel, N., Usenko, V., Stueckler, J., Cremers, D.

In IEEE International Conference on Intelligent Robots and Systems (IROS), 2018, arXiv:1804.06120 (inproceedings)

[BibTex]

Detailed Dense Inference with Convolutional Neural Networks via Discrete Wavelet Transform

Ma, L., Stueckler, J., Wu, T., Cremers, D.

arxiv, 2018, arXiv:1808.01834 (techreport)

[BibTex]

Variational Network Quantization

Achterhold, J., Koehler, J. M., Schmeink, A., Genewein, T.

In International Conference on Learning Representations, 2018 (inproceedings)

link (url) [BibTex]

Omnidirectional DSO: Direct Sparse Odometry with Fisheye Cameras

Matsuki, H., von Stumberg, L., Usenko, V., Stueckler, J., Cremers, D.

IEEE Robotics and Automation Letters (RA-L) & Int. Conference on Intelligent Robots and Systems (IROS), Robotics and Automation Letters (RA-L), IEEE, 2018 (article)

[BibTex]

Light field intrinsics with a deep encoder-decoder network

Alperovich, A., Johannsen, O., Strecke, M., Goldluecke, B.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018 (inproceedings)

link (url) [BibTex]

Sublabel-accurate convex relaxation with total generalized variation regularization

(DAGM Best Master's Thesis Award)

Strecke, M., Goldluecke, B.

In German Conference on Pattern Recognition (Proc. GCPR), 2018 (inproceedings)

link (url) [BibTex]


2017


From Monocular SLAM to Autonomous Drone Exploration

von Stumberg, L., Usenko, V., Engel, J., Stueckler, J., Cremers, D.

In European Conference on Mobile Robots (ECMR), September 2017 (inproceedings)

[BibTex]

Multi-View Deep Learning for Consistent Semantic Mapping with RGB-D Cameras

Ma, L., Stueckler, J., Kerl, C., Cremers, D.

In IEEE International Conference on Intelligent Robots and Systems (IROS), Vancouver, Canada, 2017 (inproceedings)

[BibTex]

Accurate depth and normal maps from occlusion-aware focal stack symmetry

Strecke, M., Alperovich, A., Goldluecke, B.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017 (inproceedings)

source code link (url) [BibTex]

Semi-Supervised Deep Learning for Monocular Depth Map Prediction

Kuznietsov, Y., Stueckler, J., Leibe, B.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017 (inproceedings)

[BibTex]

Shadow and Specularity Priors for Intrinsic Light Field Decomposition

Alperovich, A., Johannsen, O., Strecke, M., Goldluecke, B.

In Energy Minimization Methods in Computer Vision and Pattern Recognition (EMMCVPR), 2017 (inproceedings)

link (url) [BibTex]
