Event-based cameras are a new type of vision sensor whose pixels operate independently and respond asynchronously to brightness changes with microsecond resolution, instead of outputting standard intensity frames. Compared with traditional cameras, event-based cameras offer low latency, no motion blur, and high dynamic range (HDR), which enables robots to handle challenging scenes. We propose a visual-inertial odometry method for stereo event-based cameras based on an Error-State Kalman Filter (ESKF). The vision module updates the pose by aligning the edges of a semi-dense 3D map to a 2D image, while the IMU module updates the pose using median integration. We evaluate our method on public datasets with general 6-DoF motion (three-axis translation and three-axis rotation) and compare the results against the ground truth and against other methods, demonstrating the effectiveness of our approach.
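
To make the IMU propagation step concrete, the following is a minimal sketch of one integration step, assuming that "median integration" refers to the midpoint rule (averaging two consecutive IMU samples). The function names, the gravity convention (world z-axis up), the quaternion ordering (w, x, y, z), and the omission of sensor biases and noise are all illustrative assumptions and are not taken from the method itself.

```python
import math

# Assumed world-frame gravity vector (z-axis up); hypothetical convention.
GRAVITY = (0.0, 0.0, -9.81)

def quat_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def quat_from_rotvec(rv):
    """Unit quaternion for a rotation vector (axis * angle, in radians)."""
    angle = math.sqrt(sum(c * c for c in rv))
    if angle < 1e-12:
        # Small-angle approximation; the caller renormalizes afterwards.
        return (1.0, 0.5 * rv[0], 0.5 * rv[1], 0.5 * rv[2])
    s = math.sin(0.5 * angle) / angle
    return (math.cos(0.5 * angle), s * rv[0], s * rv[1], s * rv[2])

def rotate(q, v):
    """Rotate a 3-vector from the body frame to the world frame by q."""
    w, x, y, z = q
    r = quat_mul(quat_mul(q, (0.0, v[0], v[1], v[2])), (w, -x, -y, -z))
    return r[1:]

def midpoint_step(p, v, q, acc0, gyr0, acc1, gyr1, dt):
    """Propagate position p, velocity v, and orientation q over dt seconds
    using the midpoint of two consecutive IMU samples (biases ignored)."""
    # Orientation: integrate the averaged angular rate.
    w_mid = tuple(0.5 * (a + b) for a, b in zip(gyr0, gyr1))
    q_new = quat_mul(q, quat_from_rotvec(tuple(w * dt for w in w_mid)))
    norm = math.sqrt(sum(c * c for c in q_new))
    q_new = tuple(c / norm for c in q_new)
    # Acceleration: average the world-frame accelerations at both ends.
    a0 = tuple(a + g for a, g in zip(rotate(q, acc0), GRAVITY))
    a1 = tuple(a + g for a, g in zip(rotate(q_new, acc1), GRAVITY))
    a_mid = tuple(0.5 * (x + y) for x, y in zip(a0, a1))
    # Position and velocity under constant a_mid over the interval.
    p_new = tuple(pi + vi * dt + 0.5 * ai * dt * dt
                  for pi, vi, ai in zip(p, v, a_mid))
    v_new = tuple(vi + ai * dt for vi, ai in zip(v, a_mid))
    return p_new, v_new, q_new
```

In an ESKF of the kind described, a step like this would propagate the nominal state between vision updates, with the error-state covariance propagated alongside it; that covariance step, and the bias terms, are left out of this sketch.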