This paper presents DPJAIT, a multimodal dataset of drone flights prepared in two variants: simulation-based sequences and real recordings with measurements captured by the gold-standard Vicon motion capture system. It contains video sequences registered by a synchronized and calibrated multicamera set, together with reference 3D drone positions at successive time instants, obtained either from the simulation procedure or by motion capture. In addition, some scenarios include ArUco markers placed in the scene at known 3D positions, as well as RGB cameras mounted on the drones, for which the intrinsic parameters are provided. Three applications of 3D tracking are demonstrated. They are based on solving an overdetermined system of linear equations describing camera projection, on particle swarm optimization, and on determining the extrinsic matrix of the drone-mounted camera from recognized ArUco markers.
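To illustrate the first of the three applications, the following is a minimal sketch of linear triangulation from an overdetermined set of projection equations. It is not the implementation used in the paper: the function names (`triangulate_point`, `make_camera`, `project`), the camera matrices, and the test point are all hypothetical and chosen only for illustration. Each calibrated camera with a 3x4 projection matrix P and a detected pixel (u, v) contributes two linear equations in the homogeneous 3D position, and the stacked system is solved in the least-squares sense via singular value decomposition.

```python
import numpy as np


def triangulate_point(projections, pixels):
    """Estimate a 3D point from two or more views by linear least squares.

    projections: list of 3x4 camera projection matrices (assumed known
                 from calibration).
    pixels:      list of (u, v) image coordinates of the same point.
    """
    rows = []
    for P, (u, v) in zip(projections, pixels):
        # Each view yields two equations: u * (p3 . X) = p1 . X, and
        # v * (p3 . X) = p2 . X, where p1, p2, p3 are the rows of P.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.asarray(rows)
    # The right singular vector of the smallest singular value minimizes
    # ||A X|| subject to ||X|| = 1.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]


def make_camera(K, R, t):
    """Build P = K [R | t] (hypothetical helper)."""
    return K @ np.hstack([R, t.reshape(3, 1)])


def project(P, X):
    """Project a 3D point to pixel coordinates (hypothetical helper)."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]


# Illustrative setup, not taken from the dataset: three cameras sharing
# assumed intrinsics K, with identity rotation and different translations.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
cams = [
    make_camera(K, np.eye(3), np.array([0.0, 0.0, 0.0])),
    make_camera(K, np.eye(3), np.array([-1.0, 0.0, 0.0])),
    make_camera(K, np.eye(3), np.array([0.0, -1.0, 0.0])),
]
point = np.array([0.5, 0.2, 4.0])  # synthetic "drone" position
pixels = [project(P, point) for P in cams]
estimate = triangulate_point(cams, pixels)
```

With three cameras the system has six equations for three unknown coordinates, so it is overdetermined. With noisy detections the SVD solution minimizes an algebraic error rather than the reprojection error, which is one reason an optimization-based refinement may be preferred in practice.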