I want to estimate the pose of a hand from a single-view RGB image, and I'm thinking of using machine learning.
The labels would be the joint positions for each image.
Can such labels be incorporated into the training data as an array?
The output would be the estimated joint positions, which I would then like to apply to a hand model.
I have little experience with machine learning, so I don't know how to prepare training data. I would appreciate any advice.
Note (specific method): I photograph a hand holding a transparent ball, with the camera at the ball's center. Because I use motion capture, I know the three-dimensional positions of the joints and the back of the hand, and I would like to use these as labels for the training data and feed them into an existing network such as a CNN. Concretely, the input is a single image, and the output is an array of estimated three-dimensional positions of the joints and the back of the hand. Most examples I find are classification problems, but mine is regression, and there aren't many references...

Tags: machine-learning
Your use of the word "label" is a little unusual, but in short: yes, it is possible.
For example, see Christian Zimmermann & Thomas Brox, "Learning to Estimate 3D Hand Pose from Single RGB Images", ICCV 2017.
Hand pose estimation is a well-studied topic, and there are sites that summarize other related work.
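As for organizing the training data: one common approach is to pair each image with a flattened array of 3D joint coordinates and treat it as a regression target. The sketch below is only illustrative (the joint count, image size, and function names are my assumptions, not from your setup); your motion-capture export would replace the random placeholders.

```python
import numpy as np

# Hypothetical sketch of a training-data layout for 3D hand-pose regression.
# NUM_JOINTS and the image size are assumptions; adjust to your mocap rig.
NUM_JOINTS = 21          # e.g. 21 hand keypoints
IMG_H, IMG_W = 128, 128  # input image resolution

def make_sample(rng):
    """One (image, label) pair: an RGB image and a flat array of 3D joints."""
    image = rng.random((IMG_H, IMG_W, 3), dtype=np.float32)   # placeholder RGB
    joints = rng.random((NUM_JOINTS, 3)).astype(np.float32)   # x, y, z per joint
    return image, joints.reshape(-1)  # label = flattened (63,) vector

rng = np.random.default_rng(0)
images, labels = zip(*(make_sample(rng) for _ in range(4)))
X = np.stack(images)   # (4, 128, 128, 3) input batch
Y = np.stack(labels)   # (4, 63) regression targets, not class indices
print(X.shape, Y.shape)
```

With a CNN framework such as PyTorch, the network's final fully connected layer would then have `NUM_JOINTS * 3` outputs, and you would minimize a mean-squared-error loss between the predicted and ground-truth coordinate arrays instead of a classification loss.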