rmz92002 committed on
Commit 5484092 · verified · 1 Parent(s): e647a50

Upload 95 files

This view is limited to 50 files because it contains too many changes.

Files changed (50)
  1. .gitattributes +46 -0
  2. GUIs/README.md +60 -0
  3. GUIs/cut_and_drag.py +1904 -0
  4. LICENSE +201 -0
  5. README.md +169 -0
  6. assets/gui.png +3 -0
  7. assets/logo_arxiv.svg +1 -0
  8. assets/logo_page.svg +1 -0
  9. assets/logo_paper.svg +1 -0
  10. assets/teaser.gif +3 -0
  11. docs/HUGGINGFACE.md +137 -0
  12. examples/camcontrol_Bridge/first_frame.png +3 -0
  13. examples/camcontrol_Bridge/mask.mp4 +3 -0
  14. examples/camcontrol_Bridge/motion_signal.mp4 +3 -0
  15. examples/camcontrol_Bridge/prompt.txt +1 -0
  16. examples/camcontrol_ConcertCrowd/first_frame.png +3 -0
  17. examples/camcontrol_ConcertCrowd/mask.mp4 +3 -0
  18. examples/camcontrol_ConcertCrowd/motion_signal.mp4 +3 -0
  19. examples/camcontrol_ConcertCrowd/prompt.txt +1 -0
  20. examples/camcontrol_ConcertStage/first_frame.png +3 -0
  21. examples/camcontrol_ConcertStage/mask.mp4 +3 -0
  22. examples/camcontrol_ConcertStage/motion_signal.mp4 +3 -0
  23. examples/camcontrol_ConcertStage/prompt.txt +1 -0
  24. examples/camcontrol_RiverOcean/first_frame.png +3 -0
  25. examples/camcontrol_RiverOcean/mask.mp4 +3 -0
  26. examples/camcontrol_RiverOcean/motion_signal.mp4 +3 -0
  27. examples/camcontrol_RiverOcean/prompt.txt +1 -0
  28. examples/camcontrol_SpiderMan/first_frame.png +3 -0
  29. examples/camcontrol_SpiderMan/mask.mp4 +3 -0
  30. examples/camcontrol_SpiderMan/motion_signal.mp4 +3 -0
  31. examples/camcontrol_SpiderMan/prompt.txt +1 -0
  32. examples/camcontrol_Volcano/first_frame.png +3 -0
  33. examples/camcontrol_Volcano/mask.mp4 +3 -0
  34. examples/camcontrol_Volcano/motion_signal.mp4 +3 -0
  35. examples/camcontrol_Volcano/prompt.txt +1 -0
  36. examples/camcontrol_VolcanoTitan/first_frame.png +3 -0
  37. examples/camcontrol_VolcanoTitan/mask.mp4 +3 -0
  38. examples/camcontrol_VolcanoTitan/motion_signal.mp4 +3 -0
  39. examples/camcontrol_VolcanoTitan/prompt.txt +1 -0
  40. examples/cutdrag_cog_Monkey/first_frame.png +3 -0
  41. examples/cutdrag_cog_Monkey/mask.mp4 +0 -0
  42. examples/cutdrag_cog_Monkey/motion_signal.mp4 +3 -0
  43. examples/cutdrag_cog_Monkey/prompt.txt +1 -0
  44. examples/cutdrag_svd_Fish/first_frame.png +3 -0
  45. examples/cutdrag_svd_Fish/mask.mp4 +0 -0
  46. examples/cutdrag_svd_Fish/motion_signal.mp4 +3 -0
  47. examples/cutdrag_wan_Birds/first_frame.png +3 -0
  48. examples/cutdrag_wan_Birds/mask.mp4 +0 -0
  49. examples/cutdrag_wan_Birds/motion_signal.mp4 +3 -0
  50. examples/cutdrag_wan_Birds/prompt.txt +1 -0
.gitattributes CHANGED
@@ -33,3 +33,49 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ assets/gui.png filter=lfs diff=lfs merge=lfs -text
+ assets/teaser.gif filter=lfs diff=lfs merge=lfs -text
+ examples/camcontrol_Bridge/first_frame.png filter=lfs diff=lfs merge=lfs -text
+ examples/camcontrol_Bridge/mask.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/camcontrol_Bridge/motion_signal.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/camcontrol_ConcertCrowd/first_frame.png filter=lfs diff=lfs merge=lfs -text
+ examples/camcontrol_ConcertCrowd/mask.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/camcontrol_ConcertCrowd/motion_signal.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/camcontrol_ConcertStage/first_frame.png filter=lfs diff=lfs merge=lfs -text
+ examples/camcontrol_ConcertStage/mask.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/camcontrol_ConcertStage/motion_signal.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/camcontrol_RiverOcean/first_frame.png filter=lfs diff=lfs merge=lfs -text
+ examples/camcontrol_RiverOcean/mask.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/camcontrol_RiverOcean/motion_signal.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/camcontrol_SpiderMan/first_frame.png filter=lfs diff=lfs merge=lfs -text
+ examples/camcontrol_SpiderMan/mask.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/camcontrol_SpiderMan/motion_signal.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/camcontrol_Volcano/first_frame.png filter=lfs diff=lfs merge=lfs -text
+ examples/camcontrol_Volcano/mask.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/camcontrol_Volcano/motion_signal.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/camcontrol_VolcanoTitan/first_frame.png filter=lfs diff=lfs merge=lfs -text
+ examples/camcontrol_VolcanoTitan/mask.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/camcontrol_VolcanoTitan/motion_signal.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_cog_Monkey/first_frame.png filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_cog_Monkey/motion_signal.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_svd_Fish/first_frame.png filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_svd_Fish/motion_signal.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_wan_Birds/first_frame.png filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_wan_Birds/motion_signal.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_wan_Cocktail/first_frame.png filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_wan_Cocktail/motion_signal.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_wan_Gardening/first_frame.png filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_wan_Gardening/motion_signal.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_wan_Hamburger/first_frame.png filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_wan_Hamburger/motion_signal.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_wan_Jumping/first_frame.png filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_wan_Monkey/first_frame.png filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_wan_Monkey/motion_signal.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_wan_Owl/first_frame.png filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_wan_Owl/motion_signal.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_wan_Rhino/first_frame.png filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_wan_Rhino/motion_signal.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_wan_Surfing/first_frame.png filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_wan_Surfing/motion_signal.mp4 filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_wan_TimeSquares/first_frame.png filter=lfs diff=lfs merge=lfs -text
+ examples/cutdrag_wan_TimeSquares/motion_signal.mp4 filter=lfs diff=lfs merge=lfs -text
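Each rule above follows the standard `.gitattributes` syntax: a path pattern followed by attributes, where `filter=lfs` routes matching files through Git LFS. A minimal sketch (illustrative only, not part of this repository) that parses such lines, e.g. to check which assets are LFS-tracked:

```python
# Parse .gitattributes-style lines into {pattern: set(attributes)}.
# Illustrative helper only; not part of the repository.
def parse_gitattributes(text: str) -> dict:
    rules = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and comments; first token is the path pattern.
        if parts and not parts[0].startswith("#"):
            rules[parts[0]] = set(parts[1:])
    return rules

rules = parse_gitattributes("assets/gui.png filter=lfs diff=lfs merge=lfs -text")
# rules["assets/gui.png"] contains "filter=lfs", marking the file as LFS-tracked
```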
GUIs/README.md ADDED
@@ -0,0 +1,60 @@
+ # Cut and Drag GUI
+
+ We provide a GUI for creating cut-and-drag examples that serve as the input motion signal for video generation with **Time to Move**.
+ Given an input frame, you can cut polygons out of the image and drag them, transform their colors, and add external images that can be dragged into the scene.
+
+ ## ✨ General Guide
+ - Select an initial image.
+ - Draw polygons on the image and drag them in one or more segments.
+ - Within a segment you can rotate, scale, and recolor the polygons.
+ - Polygons can also be cut from an external image that you add to the scene.
+ - Write a text prompt that will later be used to generate the video.
+ - Preview the motion signal with the in-app demo player.
+ - When you finish, all the inputs needed for **Time-to-Move** are saved automatically to an output directory of your choice.
+
+ ## 🧰 Requirements
+ Install the dependencies:
+ ```bash
+ pip install PySide6 opencv-python numpy imageio imageio-ffmpeg
+ ```
+
+ ## 🚀 Run
+ Run the Python script:
+ ```bash
+ python cut_and_drag.py
+ ```
+
+ ## 🖱️ How to Use
+ * Select Image — click 🖼️ Select Image and choose an image.
+ * Choose Center Crop / Center Pad at the top of the toolbar if needed.
+ * Add a Polygon — "cut" part of the image by clicking Add Polygon.
+   * Left-click to add points.
+   * When the polygon is complete, press ✅ Finish Polygon Selection.
+   * Drag to move the polygon.
+   * While a segment is active you will see corner circles for scaling and a top dot for rotating; in the video, the shape is interpolated between its pose at the start of the segment and its pose at the end.
+   * A color transformation (a hue rotation) can also be applied within a segment to change the polygon's colors.
+   * Click 🎯 End Segment to capture the annotated segment.
+   * The motion trajectory can be built from multiple segments: repeat move → 🎯 End Segment → move → 🎯 End Segment…
+ * External Image
+   * Another option is to add an external image to the scene.
+   * Click 🖼️➕ Add External Image and pick a new image.
+   * Position/scale/rotate it into its initial pose, then click ✅ Place External Image to lock the starting pose.
+   * Now animate it as before: mark a polygon, move it, and so on.
+ * Prompt
+   * Type any text prompt you want associated with this example; it will be used later for video generation with our method.
+ * Preview and Save
+   * Preview using ▶️ Play Demo.
+   * Click 💾 Save, choose an output folder, and enter a subfolder name.
+   * Click 🆕 New to start a new project.
+
+ ## Output Files
+ * first_frame.png — the initial frame for video generation
+ * motion_signal.mp4 — the reference warped video
+ * mask.mp4 — a grayscale mask of the motion
+ * prompt.txt — your prompt text
+
+ ## 🧾 License / Credits
+ Built with PySide6, OpenCV, and NumPy.
+ You own the images and exports you create with this tool.
+ The motivation for an easy-to-use tool comes from [Go-With-The-Flow](https://github.com/GoWithTheFlowPaper/gowiththeflowpaper.github.io).
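The four output files listed in the README are written into one folder per example. A short sketch (hypothetical helper, not part of the GUI) that a downstream script might use to locate and validate those files:

```python
import os

# The four files the GUI saves for every example (per the README's Output Files).
REQUIRED_FILES = ("first_frame.png", "motion_signal.mp4", "mask.mp4", "prompt.txt")

def example_paths(example_dir: str) -> dict:
    """Map each expected output name to its full path inside example_dir."""
    return {name: os.path.join(example_dir, name) for name in REQUIRED_FILES}

def missing_files(example_dir: str) -> list:
    """Names of expected files absent from example_dir, e.g. to validate an example."""
    return [name for name, path in example_paths(example_dir).items()
            if not os.path.exists(path)]
```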
GUIs/cut_and_drag.py ADDED
@@ -0,0 +1,1904 @@
+ # Copyright 2025 Noam Rotstein
+ #
+ # Licensed under the Apache License, Version 2.0 (the "License");
+ # you may not use this file except in compliance with the License.
+ # You may obtain a copy of the License at
+ #     http://www.apache.org/licenses/LICENSE-2.0
+ #
+ # Unless required by applicable law or agreed to in writing, software
+ # distributed under the License is distributed on an "AS IS" BASIS,
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ # See the License for the specific language governing permissions and
+ # limitations under the License.
+
+ import os, sys, cv2, numpy as np
+ from dataclasses import dataclass, field
+ from typing import List, Optional, Tuple, Dict
+ import shutil, subprocess
+
+ from PySide6 import QtCore, QtGui, QtWidgets
+ from PySide6.QtCore import Qt, QPointF, Signal
+ from PySide6.QtGui import (
+     QImage, QPixmap, QPainterPath, QPen, QColor, QPainter, QPolygonF, QIcon
+ )
+ from PySide6.QtWidgets import (
+     QApplication, QMainWindow, QFileDialog, QGraphicsView, QGraphicsScene,
+     QGraphicsPixmapItem, QGraphicsPathItem, QGraphicsLineItem,
+     QToolBar, QLabel, QSpinBox, QWidget, QMessageBox,
+     QComboBox, QPushButton, QGraphicsEllipseItem, QFrame, QVBoxLayout, QSlider, QHBoxLayout,
+     QPlainTextEdit
+ )
+ import imageio
+
+ # ------------------------------
+ # Utility: numpy <-> QPixmap
+ # ------------------------------
+
+ def np_bgr_to_qpixmap(img_bgr: np.ndarray) -> QPixmap:
+     h, w = img_bgr.shape[:2]
+     img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
+     qimg = QImage(img_rgb.data, w, h, img_rgb.strides[0], QImage.Format.Format_RGB888)
+     return QPixmap.fromImage(qimg.copy())
+
+ def np_rgba_to_qpixmap(img_rgba: np.ndarray) -> QPixmap:
+     h, w = img_rgba.shape[:2]
+     qimg = QImage(img_rgba.data, w, h, img_rgba.strides[0], QImage.Format.Format_RGBA8888)
+     return QPixmap.fromImage(qimg.copy())
+
+ # ------------------------------
+ # Image I/O + fit helpers
+ # ------------------------------
+
+ def load_first_frame(path: str) -> np.ndarray:
+     if not os.path.exists(path):
+         raise FileNotFoundError(path)
+     low = path.lower()
+     if low.endswith((".mp4", ".mov", ".avi", ".mkv")):
+         cap = cv2.VideoCapture(path)
+         ok, frame = cap.read()
+         cap.release()
+         if not ok:
+             raise RuntimeError("Failed to read first frame from video")
+         return frame
+     img = cv2.imread(path, cv2.IMREAD_COLOR)
+     if img is None:
+         raise RuntimeError("Failed to read image")
+     return img
+
+ def resize_then_center_crop(img: np.ndarray, target_h: int, target_w: int, interpolation=cv2.INTER_NEAREST) -> np.ndarray:
+     h, w = img.shape[:2]
+     scale = max(target_w / float(w), target_h / float(h))
+     new_w, new_h = int(round(w * scale)), int(round(h * scale))
+     resized = cv2.resize(img, (new_w, new_h), interpolation=interpolation)
+     y0 = (new_h - target_h) // 2
+     x0 = (new_w - target_w) // 2
+     return resized[y0:y0 + target_h, x0:x0 + target_w]
+
+ def fit_center_pad(img: np.ndarray, target_h: int, target_w: int, interpolation=cv2.INTER_NEAREST) -> np.ndarray:
+     h, w = img.shape[:2]
+     scale_h = target_h / float(h)
+     new_w_hfirst = int(round(w * scale_h))
+     new_h_hfirst = target_h
+     if new_w_hfirst <= target_w:
+         resized = cv2.resize(img, (new_w_hfirst, new_h_hfirst), interpolation=interpolation)
+         result = np.zeros((target_h, target_w, 3), dtype=np.uint8)
+         x0 = (target_w - new_w_hfirst) // 2
+         result[:, x0:x0 + new_w_hfirst] = resized
+         return result
+     scale_w = target_w / float(w)
+     new_w_wfirst = target_w
+     new_h_wfirst = int(round(h * scale_w))
+     resized = cv2.resize(img, (new_w_wfirst, new_h_wfirst), interpolation=interpolation)
+     result = np.zeros((target_h, target_w, 3), dtype=np.uint8)
+     y0 = (target_h - new_h_wfirst) // 2
+     result[y0:y0 + new_h_wfirst, :] = resized
+     return result
+
+ # ------------------------------
+ # Hue utilities
+ # ------------------------------
+
+ def apply_hue_shift_bgr(img_bgr: np.ndarray, hue_deg: float) -> np.ndarray:
+     """Rotate hue by hue_deg (degrees) in HSV space. S and V unchanged."""
+     if abs(hue_deg) < 1e-6:
+         return img_bgr.copy()
+     hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV)
+     h = hsv[:, :, 0].astype(np.int16)
+     offset = int(round((hue_deg / 360.0) * 179.0))
+     h = (h + offset) % 180
+     hsv[:, :, 0] = h.astype(np.uint8)
+     return cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
+
+ # ------------------------------
+ # Compositing / warping (final)
+ # ------------------------------
+
+ def alpha_over(bg_bgr: np.ndarray, fg_rgba: np.ndarray) -> np.ndarray:
+     a = (fg_rgba[:, :, 3:4].astype(np.float32) / 255.0)
+     if a.max() == 0:
+         return bg_bgr.copy()
+     fg = fg_rgba[:, :, :3].astype(np.float32)
+     bg = bg_bgr.astype(np.float32)
+     out = fg * a + bg * (1.0 - a)
+     return np.clip(out, 0, 255).astype(np.uint8)
+
+ def inpaint_background(image_bgr: np.ndarray, mask_bool: np.ndarray) -> np.ndarray:
+     mask = (mask_bool.astype(np.uint8) * 255)
+     return cv2.inpaint(image_bgr, mask, 3, cv2.INPAINT_TELEA)
+
+ def animate_polygon(image_bgr, polygon_xy, path_xy, scales, rotations_deg, interp=cv2.INTER_LINEAR, origin_xy=None):
+     """
+     Returns list of RGBA frames and list of transformed polygons per frame.
+     Uses BORDER_REPLICATE so off-canvas doesn't appear black.
+     """
+     h, w = image_bgr.shape[:2]
+     frames_rgba = []
+     polys_per_frame = []
+
+     if origin_xy is None:
+         if len(path_xy) == 0:
+             raise ValueError("animate_polygon: path_xy is empty and origin_xy not provided.")
+         origin = np.asarray(path_xy[0], dtype=np.float32)
+     else:
+         origin = np.asarray(origin_xy, dtype=np.float32)
+
+     for i in range(len(path_xy)):
+         theta = np.deg2rad(rotations_deg[i]).astype(np.float32)
+         s = float(scales[i])
+         a11 = s * np.cos(theta); a12 = -s * np.sin(theta)
+         a21 = s * np.sin(theta); a22 = s * np.cos(theta)
+         tx = path_xy[i, 0] - (a11 * origin[0] + a12 * origin[1])
+         ty = path_xy[i, 1] - (a21 * origin[0] + a22 * origin[1])
+         M = np.array([[a11, a12, tx], [a21, a22, ty]], dtype=np.float32)
+
+         warped = cv2.warpAffine(image_bgr, M, (w, h), flags=interp,
+                                 borderMode=cv2.BORDER_REPLICATE)
+
+         poly = np.asarray(polygon_xy, dtype=np.float32)
+         pts1 = np.hstack([poly, np.ones((len(poly), 1), dtype=np.float32)])
+         poly_t = (M @ pts1.T).T
+         polys_per_frame.append(poly_t.astype(np.float32))
+
+         mask = np.zeros((h, w), dtype=np.uint8)
+         cv2.fillPoly(mask, [poly_t.astype(np.int32)], 255)
+
+         rgba = np.zeros((h, w, 4), dtype=np.uint8)
+         rgba[:, :, :3] = warped
+         rgba[:, :, 3] = mask
+         frames_rgba.append(rgba)
+
+     return frames_rgba, polys_per_frame
+
+ def composite_frames(background_bgr, list_of_layer_frame_lists):
+     frames = []
+     T = len(list_of_layer_frame_lists[0]) if list_of_layer_frame_lists else 0
+     for t in range(T):
+         frame = background_bgr.copy()
+         for layer in list_of_layer_frame_lists:
+             frame = alpha_over(frame, layer[t])
+         frames.append(frame)
+     return frames
+
+ def save_video_mp4(frames_bgr, path, fps=24):
+     """
+     Write MP4 using imageio (FFmpeg backend) with H.264 + yuv420p so it works on macOS/QuickTime.
+     - Converts BGR->RGB (imageio expects RGB)
+     - Enforces even width/height (needed for yuv420p)
+     - Tags BT.709 and faststart for smooth playback
+     """
+     if not frames_bgr:
+         raise ValueError("No frames to save")
+
+     # Validate and normalize frames (to RGB uint8 and consistent size)
+     h, w = frames_bgr[0].shape[:2]
+     out_frames = []
+     for f in frames_bgr:
+         if f is None:
+             raise RuntimeError("Encountered None frame")
+         # Accept gray/BGR/BGRA; convert to BGR then to RGB
+         if f.ndim == 2:
+             f = cv2.cvtColor(f, cv2.COLOR_GRAY2BGR)
+         elif f.shape[2] == 4:
+             f = cv2.cvtColor(f, cv2.COLOR_BGRA2BGR)
+         elif f.shape[2] != 3:
+             raise RuntimeError("Frames must be gray, BGR, or BGRA")
+         if f.shape[:2] != (h, w):
+             raise RuntimeError("Frame size mismatch during save.")
+         if f.dtype != np.uint8:
+             f = np.clip(f, 0, 255).astype(np.uint8)
+         # BGR -> RGB for imageio/ffmpeg
+         out_frames.append(cv2.cvtColor(f, cv2.COLOR_BGR2RGB))
+
+     # Enforce even dims for yuv420p
+     hh = h - (h % 2)
+     ww = w - (w % 2)
+     if (hh != h) or (ww != w):
+         out_frames = [frm[:hh, :ww] for frm in out_frames]
+         h, w = hh, ww
+
+     # Try libx264 first; fall back to MPEG-4 Part 2 if libx264 missing
+     ffmpeg_common = ['-movflags', '+faststart',
+                      '-colorspace', 'bt709', '-color_primaries', 'bt709', '-color_trc', 'bt709',
+                      '-tag:v', 'avc1']  # helps QuickTime recognize H.264 properly
+     try:
+         writer = imageio.get_writer(
+             path, format='ffmpeg', fps=float(fps),
+             codec='libx264', pixelformat='yuv420p',
+             ffmpeg_params=ffmpeg_common
+         )
+     except Exception:
+         # Fallback: MPEG-4 (still Mac-friendly, a bit larger/softer)
+         writer = imageio.get_writer(
+             path, format='ffmpeg', fps=float(fps),
+             codec='mpeg4', pixelformat='yuv420p',
+             ffmpeg_params=['-movflags', '+faststart']
+         )
+
+     try:
+         for frm in out_frames:
+             writer.append_data(frm)
+     finally:
+         writer.close()
+
+     if not os.path.exists(path) or os.path.getsize(path) == 0:
+         raise RuntimeError("imageio/ffmpeg produced an empty file. Check that FFmpeg is available.")
+     return path
+
+ # ------------------------------
+ # Data structures
+ # ------------------------------
+
+ PALETTE = [
+     QColor(255, 99, 99),    # red
+     QColor(99, 155, 255),   # blue
+     QColor(120, 220, 120),  # green
+     QColor(255, 200, 80),   # orange
+     QColor(200, 120, 255),  # purple
+     QColor(120, 255, 255)   # cyan
+ ]
+
+ @dataclass
+ class Keyframe:
+     pos: np.ndarray  # (2,)
+     rot_deg: float
+     scale: float
+     hue_deg: float = 0.0
+
+ @dataclass
+ class Layer:
+     name: str
+     source_bgr: np.ndarray
+     polygon_xy: Optional[np.ndarray] = None
+     origin_local_xy: Optional[np.ndarray] = None  # bbox center in item coords
+     is_external: bool = False
+     pixmap_item: Optional[QtWidgets.QGraphicsPixmapItem] = None
+     outline_item: Optional[QGraphicsPathItem] = None
+     handle_items: List[QtWidgets.QGraphicsItem] = field(default_factory=list)
+     keyframes: List[Keyframe] = field(default_factory=list)
+     path_lines: List[QGraphicsLineItem] = field(default_factory=list)
+     preview_line: Optional[QGraphicsLineItem] = None
+     color: QColor = field(default_factory=lambda: QColor(255, 99, 99))
+
+     def has_polygon(self) -> bool:
+         return self.polygon_xy is not None and len(self.polygon_xy) >= 3
+
+ # ------------------------------
+ # Handles (scale corners + rotate dot)
+ # ------------------------------
+
+ class HandleBase(QGraphicsEllipseItem):
+     def __init__(self, r: float, color: QColor, parent=None):
+         super().__init__(-r, -r, 2*r, 2*r, parent)
+         self.setBrush(color)
+         pen = QPen(QColor(0, 0, 0), 1)
+         pen.setCosmetic(True)  # (optional) keep 1px outline at any zoom
+         self.setPen(pen)
+         self.setFlag(QGraphicsEllipseItem.ItemIsMovable, False)
+         self.setAcceptHoverEvents(True)
+         self.setZValue(2000)
+         self._item: Optional[QGraphicsPixmapItem] = None
+
+         # 👇 The key line: draw in device coords (no scaling with the polygon)
+         self.setFlag(QGraphicsEllipseItem.ItemIgnoresTransformations, True)
+
+     def set_item(self, item: QGraphicsPixmapItem):
+         self._item = item
+
+     def origin_scene(self) -> QPointF:
+         return self._item.mapToScene(self._item.transformOriginPoint())
+
+ class ScaleHandle(HandleBase):
+     def mousePressEvent(self, event: QtWidgets.QGraphicsSceneMouseEvent):
+         if not self._item: return super().mousePressEvent(event)
+         self._start_scale = self._item.scale() if self._item.scale() != 0 else 1.0
+         self._origin_scene = self.origin_scene()
+         v0 = event.scenePos() - self._origin_scene
+         self._d0 = max(1e-6, (v0.x()*v0.x() + v0.y()*v0.y())**0.5)
+         event.accept()
+     def mouseMoveEvent(self, event: QtWidgets.QGraphicsSceneMouseEvent):
+         if not self._item: return super().mouseMoveEvent(event)
+         v = event.scenePos() - self._origin_scene
+         d = max(1e-6, (v.x()*v.x() + v.y()*v.y())**0.5)
+         s = float(self._start_scale * (d / self._d0))
+         s = float(np.clip(s, 0.05, 10.0))
+         self._item.setScale(s)
+         event.accept()
+
+ class RotateHandle(HandleBase):
+     def mousePressEvent(self, event: QtWidgets.QGraphicsSceneMouseEvent):
+         if not self._item: return super().mousePressEvent(event)
+         self._start_rot = self._item.rotation()
+         self._origin_scene = self.origin_scene()
+         v0 = event.scenePos() - self._origin_scene
+         self._a0 = np.degrees(np.arctan2(v0.y(), v0.x()))
+         event.accept()
+     def mouseMoveEvent(self, event: QtWidgets.QGraphicsSceneMouseEvent):
+         if not self._item: return super().mouseMoveEvent(event)
+         v = event.scenePos() - self._origin_scene
+         a = np.degrees(np.arctan2(v.y(), v.x()))
+         delta = a - self._a0
+         r = self._start_rot + delta
+         if r > 180: r -= 360
+         if r < -180: r += 360
+         self._item.setRotation(r)
+         event.accept()
+
+ # ------------------------------
+ # Pixmap item that notifies on transform changes
+ # ------------------------------
+
+ class NotifyingPixmapItem(QGraphicsPixmapItem):
+     def __init__(self, pm: QPixmap, on_change_cb=None):
+         super().__init__(pm)
+         self._on_change_cb = on_change_cb
+         self.setFlag(QGraphicsPixmapItem.ItemSendsGeometryChanges, True)
+     def itemChange(self, change, value):
+         ret = super().itemChange(change, value)
+         if change in (QGraphicsPixmapItem.ItemPositionHasChanged,
+                       QGraphicsPixmapItem.ItemRotationHasChanged,
+                       QGraphicsPixmapItem.ItemScaleHasChanged):
+             if callable(self._on_change_cb):
+                 self._on_change_cb()
+         return ret
+
+
367
+ # ------------------------------
368
+ # Canvas
369
+ # ------------------------------
370
+
371
+ class Canvas(QGraphicsView):
372
+ MODE_IDLE = 0
373
+ MODE_DRAW_POLY = 1
374
+
375
+ polygon_finished = Signal(bool)
376
+ end_segment_requested = Signal()
377
+
378
+ def __init__(self, parent=None):
379
+ super().__init__(parent)
380
+ self.setRenderHint(QtGui.QPainter.Antialiasing, True)
381
+ self.setRenderHint(QtGui.QPainter.SmoothPixmapTransform, False) # NN preview
382
+ self.setTransformationAnchor(QGraphicsView.AnchorUnderMouse)
383
+ self.setDragMode(QGraphicsView.NoDrag)
384
+ self.scene = QGraphicsScene(self)
385
+ self.setScene(self.scene)
386
+
387
+ self.base_bgr = None
388
+ self.base_preview_bgr = None
389
+ self.base_item = None
390
+ self.layers: List[Layer] = []
391
+ self.current_layer: Optional[Layer] = None
392
+ self.layer_index = 0
393
+
394
+ self.mode = Canvas.MODE_IDLE
395
+ self.temp_points: List[QPointF] = []
396
+ self.temp_path_item: Optional[QGraphicsPathItem] = None
397
+ self.first_click_marker: Optional[QGraphicsEllipseItem] = None
398
+
399
+ self.fit_mode_combo = None
400
+ self.target_w = 720
401
+ self.target_h = 480
402
+
403
+ # hue preview for current segment (degrees)
404
+ self.current_segment_hue_deg: float = 0.0
405
+
406
+ # Demo playback
407
+ self.play_timer = QtCore.QTimer(self)
408
+ self.play_timer.timeout.connect(self._on_play_tick)
409
+ self.play_frames: List[np.ndarray] = []
410
+ self.play_index = 0
411
+ self.player_item: Optional[QGraphicsPixmapItem] = None
412
+
413
+ self.setMouseTracking(True)
414
+ self.setHorizontalScrollBarPolicy(Qt.ScrollBarAsNeeded)
415
+ self.setVerticalScrollBarPolicy(Qt.ScrollBarAsNeeded)
416
+
417
+
418
+ # ------------ small helpers ------------
419
+ def _remove_if_in_scene(self, item):
420
+ if item is None:
421
+ return
422
+ try:
423
+ sc = item.scene()
424
+ if sc is None:
425
+ return # already detached or removed
426
+ if sc is self.scene:
427
+ self.scene.removeItem(item)
428
+ else:
429
+ # If the item belongs to a different scene, remove it from THAT scene.
430
+ sc.removeItem(item)
431
+ except Exception:
432
+ pass
433
+
434
+
435
+ def _apply_pose_from_origin_scene(self, item, origin_scene_qp: QPointF, rot: float, scale: float):
436
+ item.setRotation(float(rot))
437
+ item.setScale(float(scale) if scale != 0 else 1.0)
438
+ new_origin = item.mapToScene(item.transformOriginPoint())
439
+ d = origin_scene_qp - new_origin
440
+ item.setPos(item.pos() + d)
441
+
442
+ # ------------ Icons ------------
443
+ def make_pentagon_icon(self) -> QIcon:
444
+ pm = QPixmap(22, 22)
445
+ pm.fill(Qt.GlobalColor.transparent)
446
+ p = QPainter(pm)
447
+ p.setRenderHint(QPainter.Antialiasing, True)
448
+ pen = QPen(QColor(40, 40, 40)); pen.setWidth(2)
449
+ p.setPen(pen)
450
+ r = 8; cx, cy = 11, 11
451
+ pts = []
452
+ for i in range(5):
453
+ ang = -90 + i * 72
454
+ rad = np.radians(ang)
455
+ pts.append(QPointF(cx + r * np.cos(rad), cy + r * np.sin(rad)))
456
+ p.drawPolygon(QPolygonF(pts))
457
+ p.end()
458
+ return QIcon(pm)

    # -------- Fit helpers --------
    def _apply_fit(self, img: np.ndarray) -> np.ndarray:
        mode = 'Center Crop'
        if self.fit_mode_combo is not None:
            mode = self.fit_mode_combo.currentText()
        if mode == 'Center Pad':
            return fit_center_pad(img, self.target_h, self.target_w, interpolation=cv2.INTER_NEAREST)
        else:
            return resize_then_center_crop(img, self.target_h, self.target_w, interpolation=cv2.INTER_NEAREST)

    def _refresh_inpaint_preview(self):
        if self.base_bgr is None:
            return
        H, W = self.base_bgr.shape[:2]
        total_mask = np.zeros((H, W), dtype=bool)
        for L in self.layers:
            if not L.has_polygon() or L.is_external:
                continue
            poly0 = L.polygon_xy.astype(np.int32)
            mask = np.zeros((H, W), dtype=np.uint8)
            cv2.fillPoly(mask, [poly0], 255)
            total_mask |= (mask > 0)
        inpainted = inpaint_background(self.base_bgr, total_mask)
        self.base_preview_bgr = inpainted.copy()
        if self.base_item is not None:
            self.base_item.setPixmap(np_bgr_to_qpixmap(self.base_preview_bgr))

    # -------- Scene expansion helpers --------
    def _expand_scene_to_item(self, item: QtWidgets.QGraphicsItem, margin: int = 120, center: bool = True):
        if item is None:
            return
        try:
            local_rect = item.boundingRect()
            poly = item.mapToScene(local_rect)
            r = poly.boundingRect()
        except Exception:
            r = self.scene.sceneRect()
        sr = self.scene.sceneRect().united(r.adjusted(-margin, -margin, margin, margin))
        self.scene.setSceneRect(sr)
        self.ensureVisible(r.adjusted(-20, -20, 20, 20))
        if center:
            try:
                self.centerOn(item)
            except Exception:
                pass

    # -------- Base image --------
    def set_base_image(self, bgr_original: np.ndarray):
        self.scene.clear()
        for L in self.layers:
            L.handle_items.clear(); L.path_lines.clear(); L.preview_line = None
        self.layers.clear(); self.current_layer = None
        base_for_save = resize_then_center_crop(bgr_original, self.target_h, self.target_w, interpolation=cv2.INTER_AREA)
        self.base_bgr = base_for_save.copy()
        self.base_preview_bgr = self._apply_fit(bgr_original)
        pm = np_bgr_to_qpixmap(self.base_preview_bgr)
        self.base_item = self.scene.addPixmap(pm)
        self.base_item.setZValue(0)
        self.base_item.setTransformationMode(Qt.FastTransformation)  # nearest-neighbor
        self.setSceneRect(0, 0, pm.width(), pm.height())

    # -------- External sprite layer (no keyframe yet) --------
    def add_external_sprite_layer(self, raw_bgr: np.ndarray) -> 'Layer':
        if self.base_bgr is None:
            return None
        H, W = self.base_bgr.shape[:2]
        h0, w0 = raw_bgr.shape[:2]
        target_h = int(0.6 * H)
        scale = target_h / float(h0)
        ew = int(round(w0 * scale))
        eh = int(round(h0 * scale))
        ext_small = cv2.resize(raw_bgr, (ew, eh), interpolation=cv2.INTER_AREA)

        # pack onto the same canvas size as the base image
        px = (W - ew) // 2
        py = (H - eh) // 2
        source_bgr = np.zeros((H, W, 3), dtype=np.uint8)
        x0 = max(px, 0); y0 = max(py, 0)
        x1 = min(px + ew, W); y1 = min(py + eh, H)
        if x0 < x1 and y0 < y1:
            sx0 = x0 - px; sy0 = y0 - py
            sx1 = sx0 + (x1 - x0); sy1 = sy0 + (y1 - y0)
            source_bgr[y0:y1, x0:x1] = ext_small[sy0:sy1, sx0:sx1]

        rect_poly = np.array([[px, py], [px + ew, py], [px + ew, py + eh], [px, py + eh]], dtype=np.float32)
        cx, cy = px + ew / 2.0, py + eh / 2.0

        mask_rect = np.zeros((H, W), dtype=np.uint8)
        cv2.fillPoly(mask_rect, [rect_poly.astype(np.int32)], 255)
        rgba = np.dstack([cv2.cvtColor(source_bgr, cv2.COLOR_BGR2RGB), mask_rect])
        pm = np_rgba_to_qpixmap(rgba)

        color = PALETTE[self.layer_index % len(PALETTE)]; self.layer_index += 1
        L = Layer(name=f"Layer {len(self.layers)+1} (ext)", source_bgr=source_bgr, is_external=True,
                  polygon_xy=rect_poly.copy(), origin_local_xy=np.array([cx, cy], dtype=np.float32), color=color)
        self.layers.append(L); self.current_layer = L

        def on_change():
            if L.keyframes:
                self._ensure_preview_line(L)
            self._relayout_handles(L)

        item = NotifyingPixmapItem(pm, on_change_cb=on_change)
        item.setZValue(10 + len(self.layers))
        item.setFlag(QGraphicsPixmapItem.ItemIsMovable, True)  # place first
        item.setFlag(QGraphicsPixmapItem.ItemIsSelectable, False)
        item.setTransformationMode(Qt.FastTransformation)
        item.setShapeMode(QGraphicsPixmapItem.ShapeMode.MaskShape)
        item.setTransformOriginPoint(QPointF(cx, cy))
        self.scene.addItem(item); L.pixmap_item = item

        # start slightly to the left, still visible
        min_vis = min(max(40, ew // 5), W // 2)
        outside_x = min_vis - (px + ew)
        item.setPos(outside_x, 0)

        # rect outline + handles (for initial placement)
        qpath = QPainterPath(QPointF(rect_poly[0, 0], rect_poly[0, 1]))
        for i in range(1, len(rect_poly)):
            qpath.lineTo(QPointF(rect_poly[i, 0], rect_poly[i, 1]))
        qpath.closeSubpath()
        outline = QGraphicsPathItem(qpath, parent=item)
        outline.setPen(QPen(L.color, 2, Qt.DashLine))
        outline.setZValue(item.zValue() + 1)
        L.outline_item = outline
        self._create_handles_for_layer(L)

        self._expand_scene_to_item(item, center=True)
        return L

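The `max`/`min` clamping above pastes the resized sprite onto the canvas even when part of it falls outside the canvas bounds. The same arithmetic in isolation (`paste_clipped` is a hypothetical helper name, not part of this file):

```python
import numpy as np

def paste_clipped(canvas: np.ndarray, patch: np.ndarray, px: int, py: int) -> np.ndarray:
    """Copy `patch` onto `canvas` with its top-left at (px, py), clipping to the canvas."""
    H, W = canvas.shape[:2]
    h, w = patch.shape[:2]
    x0, y0 = max(px, 0), max(py, 0)
    x1, y1 = min(px + w, W), min(py + h, H)
    if x0 < x1 and y0 < y1:
        sx0, sy0 = x0 - px, y0 - py  # offset into the patch for the clipped region
        canvas[y0:y1, x0:x1] = patch[sy0:sy0 + (y1 - y0), sx0:sx0 + (x1 - x0)]
    return canvas

# paste a 3x3 patch one pixel above and left of the canvas: only a 2x2 corner lands
canvas = np.zeros((4, 4), dtype=np.uint8)
paste_clipped(canvas, np.full((3, 3), 9, dtype=np.uint8), -1, -1)
```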
    def place_external_initial_keyframe(self, L: 'Layer'):
        if not (L and L.pixmap_item):
            return
        origin_scene = L.pixmap_item.mapToScene(L.pixmap_item.transformOriginPoint())
        L.keyframes.append(Keyframe(
            pos=np.array([origin_scene.x(), origin_scene.y()], dtype=np.float32),
            rot_deg=float(L.pixmap_item.rotation()),
            scale=float(L.pixmap_item.scale()) if L.pixmap_item.scale() != 0 else 1.0,
            hue_deg=0.0
        ))
        self._ensure_preview_line(L)

    # -------- Polygon authoring --------
    def new_layer_from_source(self, name: str, source_bgr: np.ndarray, is_external: bool):
        color = PALETTE[self.layer_index % len(PALETTE)]; self.layer_index += 1
        layer = Layer(name=name, source_bgr=source_bgr.copy(), is_external=is_external, color=color)
        self.layers.append(layer); self.current_layer = layer
        self.start_draw_polygon(preserve_motion=False)

    def start_draw_polygon(self, preserve_motion: bool):
        L = self.current_layer
        if L is None:
            return
        if preserve_motion:
            for it in L.handle_items:
                it.setVisible(False)
        else:
            # We are re-drawing on the current (base) layer: clear visuals first.
            # Remove children first (outline/handles/preview), then the parent pixmap.
            if L.outline_item is not None:
                self._remove_if_in_scene(L.outline_item)
                L.outline_item = None
            for it in L.handle_items:
                self._remove_if_in_scene(it)
            L.handle_items = []
            if L.preview_line is not None:
                self._remove_if_in_scene(L.preview_line)
                L.preview_line = None
            if L.pixmap_item is not None:
                self._remove_if_in_scene(L.pixmap_item)
                L.pixmap_item = None

        L.path_lines = []
        L.keyframes.clear()
        L.polygon_xy = None
        self.mode = Canvas.MODE_DRAW_POLY
        self.temp_points = []
        if self.temp_path_item is not None:
            self.scene.removeItem(self.temp_path_item); self.temp_path_item = None
        if self.first_click_marker is not None:
            self.scene.removeItem(self.first_click_marker); self.first_click_marker = None
        # reset hue preview for the new segment
        self.current_segment_hue_deg = 0.0

    def _compute_ext_rect_from_source(self, src_bgr: np.ndarray) -> np.ndarray:
        ys, xs = np.where(np.any(src_bgr != 0, axis=2))
        if len(xs) == 0 or len(ys) == 0:
            return np.array([[0, 0], [0, 0], [0, 0], [0, 0]], dtype=np.float32)
        x0, x1 = int(xs.min()), int(xs.max())
        y0, y1 = int(ys.min()), int(ys.max())
        return np.array([[x0, y0], [x1, y0], [x1, y1], [x0, y1]], dtype=np.float32)
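`_compute_ext_rect_from_source` returns the corners (top-left, top-right, bottom-right, bottom-left) of the bounding box of non-black pixels. The core `np.any`/`np.where` reduction can be exercised on its own:

```python
import numpy as np

# synthetic BGR image with one non-black block: rows 2..4, cols 3..7
img = np.zeros((10, 10, 3), dtype=np.uint8)
img[2:5, 3:8] = 255

# collapse channels, then collect row/col indices of any non-zero pixel
ys, xs = np.where(np.any(img != 0, axis=2))
x0, x1 = int(xs.min()), int(xs.max())
y0, y1 = int(ys.min()), int(ys.max())
corners = np.array([[x0, y0], [x1, y0], [x1, y1], [x0, y1]], dtype=np.float32)
```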

    def _make_rgba_from_bgr_and_maskpoly(self, bgr: np.ndarray, poly: np.ndarray) -> np.ndarray:
        H, W = bgr.shape[:2]
        mask = np.zeros((H, W), dtype=np.uint8)
        if poly is not None and poly.size:
            cv2.fillPoly(mask, [poly.astype(np.int32)], 255)
        rgba = np.dstack([cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB), mask])
        return rgba

    def _add_static_external_item(self, bgr_inpainted: np.ndarray, rect_poly: np.ndarray,
                                  kf0: 'Keyframe', z_under: float, color: QColor) -> QGraphicsPixmapItem:
        rgba = self._make_rgba_from_bgr_and_maskpoly(bgr_inpainted, rect_poly)
        pm = np_rgba_to_qpixmap(rgba)
        item = QGraphicsPixmapItem(pm)
        item.setZValue(max(1.0, z_under - 0.2))
        item.setTransformationMode(Qt.FastTransformation)
        item.setShapeMode(QGraphicsPixmapItem.ShapeMode.MaskShape)
        cx = (rect_poly[:, 0].min() + rect_poly[:, 0].max()) / 2.0
        cy = (rect_poly[:, 1].min() + rect_poly[:, 1].max()) / 2.0
        item.setTransformOriginPoint(QPointF(cx, cy))
        self.scene.addItem(item)
        self._apply_pose_from_origin_scene(item, QPointF(kf0.pos[0], kf0.pos[1]), kf0.rot_deg, kf0.scale)
        path = QPainterPath(QPointF(rect_poly[0, 0], rect_poly[0, 1]))
        for i in range(1, len(rect_poly)):
            path.lineTo(QPointF(rect_poly[i, 0], rect_poly[i, 1]))
        path.closeSubpath()
        outline = QGraphicsPathItem(path, parent=item)
        outline.setPen(QPen(color, 1, Qt.DashLine))
        outline.setZValue(item.zValue() + 0.01)
        self._expand_scene_to_item(item, center=False)
        return item

    def _update_current_item_hue_preview(self):
        """Live hue preview for the current moving polygon sprite."""
        L = self.current_layer
        if not (L and L.pixmap_item and L.polygon_xy is not None):
            return
        # Rebuild the moving item's pixmap with the current hue (full-strength preview)
        bgr = L.source_bgr
        if abs(self.current_segment_hue_deg) > 1e-6:
            bgr = apply_hue_shift_bgr(bgr, self.current_segment_hue_deg)
        rgba = self._make_rgba_from_bgr_and_maskpoly(bgr, L.polygon_xy)
        L.pixmap_item.setPixmap(np_rgba_to_qpixmap(rgba))

    def finish_polygon(self, preserve_motion: bool) -> bool:
        L = self.current_layer
        if L is None or self.mode != Canvas.MODE_DRAW_POLY:
            return False
        if len(self.temp_points) < 3:
            return False

        pts_scene = [QtCore.QPointF(p) for p in self.temp_points]

        if preserve_motion and L.pixmap_item is not None and L.is_external:
            # ===== EXTERNAL split: remove the old rect item, add a static rect-with-hole
            # and a new moving polygon =====
            old_item = L.pixmap_item

            # polygon in the old item's LOCAL coords (source_bgr space)
            pts_local_qt = [old_item.mapFromScene(p) for p in pts_scene]
            pts_local = np.array([[p.x(), p.y()] for p in pts_local_qt], dtype=np.float32)

            # origin for the moving polygon = polygon bbox center
            x0, y0 = pts_local.min(axis=0)
            x1, y1 = pts_local.max(axis=0)
            cx_local, cy_local = (x0 + x1) / 2.0, (y0 + y1) / 2.0

            rect_poly_prev = (L.polygon_xy.copy()
                              if (L.polygon_xy is not None and len(L.polygon_xy) >= 3)
                              else self._compute_ext_rect_from_source(L.source_bgr))

            # cache pose / z
            old_origin_scene = old_item.mapToScene(old_item.transformOriginPoint())
            old_rot = old_item.rotation()
            old_scale = old_item.scale() if old_item.scale() != 0 else 1.0
            old_z = old_item.zValue()

            # build RGBA for the moving polygon & the static rect-with-hole
            H, W = L.source_bgr.shape[:2]
            rgb_full = cv2.cvtColor(L.source_bgr, cv2.COLOR_BGR2RGB)

            mov_mask = np.zeros((H, W), dtype=np.uint8); cv2.fillPoly(mov_mask, [pts_local.astype(np.int32)], 255)
            mov_rgba = np.dstack([rgb_full, mov_mask])

            hole_mask = np.zeros((H, W), dtype=np.uint8); cv2.fillPoly(hole_mask, [pts_local.astype(np.int32)], 255)
            inpainted_ext = inpaint_background(L.source_bgr, hole_mask > 0)
            rect_mask = np.zeros((H, W), dtype=np.uint8); cv2.fillPoly(rect_mask, [rect_poly_prev.astype(np.int32)], 255)
            static_rgba = np.dstack([cv2.cvtColor(inpainted_ext, cv2.COLOR_BGR2RGB), rect_mask])

            # remove the old outline/handles and the old item itself
            if L.outline_item is not None:
                self._remove_if_in_scene(L.outline_item); L.outline_item = None
            for it in L.handle_items:
                self._remove_if_in_scene(it)
            L.handle_items = []
            self._remove_if_in_scene(old_item)

            # STATIC rect (non-movable, below)
            kf0 = L.keyframes[0] if L.keyframes else Keyframe(
                pos=np.array([old_origin_scene.x(), old_origin_scene.y()], dtype=np.float32),
                rot_deg=old_rot, scale=old_scale, hue_deg=0.0
            )
            static_item = QGraphicsPixmapItem(np_rgba_to_qpixmap(static_rgba))
            static_item.setZValue(max(1.0, old_z - 0.2))
            static_item.setTransformationMode(Qt.FastTransformation)
            static_item.setShapeMode(QGraphicsPixmapItem.ShapeMode.MaskShape)
            rcx = (rect_poly_prev[:, 0].min() + rect_poly_prev[:, 0].max()) / 2.0
            rcy = (rect_poly_prev[:, 1].min() + rect_poly_prev[:, 1].max()) / 2.0
            static_item.setTransformOriginPoint(QPointF(rcx, rcy))
            self.scene.addItem(static_item)
            self._apply_pose_from_origin_scene(static_item, QPointF(kf0.pos[0], kf0.pos[1]), kf0.rot_deg, kf0.scale)
            # dashed outline for the static rect
            qpath_rect = QPainterPath(QPointF(rect_poly_prev[0, 0], rect_poly_prev[0, 1]))
            for i in range(1, len(rect_poly_prev)):
                qpath_rect.lineTo(QPointF(rect_poly_prev[i, 0], rect_poly_prev[i, 1]))
            qpath_rect.closeSubpath()
            outline_static = QGraphicsPathItem(qpath_rect, parent=static_item)
            outline_static.setPen(QPen(L.color, 1, Qt.DashLine))
            outline_static.setZValue(static_item.zValue() + 0.01)

            # NEW MOVING polygon (on top)
            def on_change():
                if L.keyframes:
                    self._ensure_preview_line(L)
                self._relayout_handles(L)

            poly_item = NotifyingPixmapItem(np_rgba_to_qpixmap(mov_rgba), on_change_cb=on_change)
            poly_item.setZValue(old_z + 0.2)
            poly_item.setFlag(QGraphicsPixmapItem.ItemIsMovable, True)
            poly_item.setFlag(QGraphicsPixmapItem.ItemIsSelectable, False)
            poly_item.setTransformationMode(Qt.FastTransformation)
            poly_item.setShapeMode(QGraphicsPixmapItem.ShapeMode.MaskShape)
            poly_item.setTransformOriginPoint(QPointF(cx_local, cy_local))
            self.scene.addItem(poly_item)
            self._apply_pose_from_origin_scene(poly_item, old_origin_scene, old_rot, old_scale)

            # outline/handles on the moving polygon
            qpath = QPainterPath(QPointF(pts_local[0, 0], pts_local[0, 1]))
            for i in range(1, len(pts_local)):
                qpath.lineTo(QPointF(pts_local[i, 0], pts_local[i, 1]))
            qpath.closeSubpath()
            outline_move = QGraphicsPathItem(qpath, parent=poly_item)
            outline_move.setPen(QPen(L.color, 2))
            outline_move.setZValue(poly_item.zValue() + 1)

            L.polygon_xy = pts_local
            L.origin_local_xy = np.array([cx_local, cy_local], dtype=np.float32)
            L.pixmap_item = poly_item
            L.outline_item = outline_move
            self._create_handles_for_layer(L)
            self._ensure_preview_line(L)

            # live hue preview starts neutral for the new polygon
            self.current_segment_hue_deg = 0.0
            self._expand_scene_to_item(poly_item, center=True)

        else:
            # ===== BASE image polygon path =====
            pts = np.array([[p.x(), p.y()] for p in pts_scene], dtype=np.float32)
            x0, y0 = pts.min(axis=0); x1, y1 = pts.max(axis=0)
            cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0

            L.polygon_xy = pts
            L.origin_local_xy = np.array([cx, cy], dtype=np.float32)

            rgb = cv2.cvtColor(L.source_bgr, cv2.COLOR_BGR2RGB)
            h, w = rgb.shape[:2]
            mask = np.zeros((h, w), dtype=np.uint8)
            cv2.fillPoly(mask, [pts.astype(np.int32)], 255)
            rgba = np.dstack([rgb, mask])
            pm = np_rgba_to_qpixmap(rgba)

            def on_change():
                if L.keyframes:
                    self._ensure_preview_line(L)
                self._relayout_handles(L)

            item = NotifyingPixmapItem(pm, on_change_cb=on_change)
            item.setZValue(10 + len(self.layers))
            item.setFlag(QGraphicsPixmapItem.ItemIsMovable, True)
            item.setFlag(QGraphicsPixmapItem.ItemIsSelectable, False)
            item.setTransformationMode(Qt.FastTransformation)
            item.setShapeMode(QGraphicsPixmapItem.ShapeMode.MaskShape)
            item.setTransformOriginPoint(QPointF(cx, cy))
            self.scene.addItem(item); L.pixmap_item = item

            qpath = QPainterPath(QPointF(pts[0, 0], pts[0, 1]))
            for i in range(1, len(pts)):
                qpath.lineTo(QPointF(pts[i, 0], pts[i, 1]))
            qpath.closeSubpath()
            outline = QGraphicsPathItem(qpath, parent=item)
            outline.setPen(QPen(L.color, 2))
            outline.setZValue(item.zValue() + 1)
            L.outline_item = outline
            self._create_handles_for_layer(L)

            origin_scene = item.mapToScene(item.transformOriginPoint())
            L.keyframes.append(Keyframe(pos=np.array([origin_scene.x(), origin_scene.y()], dtype=np.float32),
                                        rot_deg=float(item.rotation()),
                                        scale=float(item.scale()) if item.scale() != 0 else 1.0,
                                        hue_deg=0.0))

            if not L.is_external:
                self._refresh_inpaint_preview()

        if self.temp_path_item is not None:
            self._remove_if_in_scene(self.temp_path_item); self.temp_path_item = None
        if self.first_click_marker is not None:
            self._remove_if_in_scene(self.first_click_marker); self.first_click_marker = None
        self.temp_points = []
        self.mode = Canvas.MODE_IDLE
        # ensure the current hue preview is applied (neutral at first)
        self._update_current_item_hue_preview()
        return True

    # -------- UI helpers --------
    def _create_handles_for_layer(self, L: Layer):
        if L.polygon_xy is None or L.pixmap_item is None:
            return
        x0, y0 = L.polygon_xy.min(axis=0)
        x1, y1 = L.polygon_xy.max(axis=0)
        corners = [QPointF(x0, y0), QPointF(x1, y0), QPointF(x1, y1), QPointF(x0, y1)]
        top_center = QPointF((x0 + x1) / 2.0, y0)
        rot_pos = QPointF(top_center.x(), top_center.y() - 24)

        box_path = QPainterPath(corners[0])
        for p in corners[1:]:
            box_path.lineTo(p)
        box_path.closeSubpath()
        # bbox (dashed) around the polygon
        bbox_item = QGraphicsPathItem(box_path, parent=L.pixmap_item)
        pen = QPen(L.color, 1, Qt.DashLine)
        pen.setCosmetic(True)
        bbox_item.setPen(pen)
        bbox_item.setZValue(L.pixmap_item.zValue() + 0.5)
        L.handle_items.append(bbox_item)

        for c in corners:
            h = ScaleHandle(6, L.color, parent=L.pixmap_item)
            h.setPos(c); h.set_item(L.pixmap_item)
            L.handle_items.append(h)
        rot_dot = RotateHandle(6, L.color, parent=L.pixmap_item)
        rot_dot.setPos(rot_pos); rot_dot.set_item(L.pixmap_item)
        L.handle_items.append(rot_dot)
        tether = QGraphicsLineItem(QtCore.QLineF(top_center, rot_pos), L.pixmap_item)
        pen_tether = QPen(L.color, 1)
        pen_tether.setCosmetic(True)
        tether.setPen(pen_tether)
        tether.setZValue(L.pixmap_item.zValue() + 0.4)
        L.handle_items.append(tether)

    def _relayout_handles(self, L: Layer):
        if L.polygon_xy is None or L.pixmap_item is None or not L.handle_items:
            return
        x0, y0 = L.polygon_xy.min(axis=0); x1, y1 = L.polygon_xy.max(axis=0)
        corners = [QPointF(x0, y0), QPointF(x1, y0), QPointF(x1, y1), QPointF(x0, y1)]
        top_center = QPointF((x0 + x1) / 2.0, y0)
        rot_pos = QPointF(top_center.x(), top_center.y() - 24)
        bbox_item = L.handle_items[0]
        if isinstance(bbox_item, QGraphicsPathItem):
            box_path = QPainterPath(corners[0])
            for p in corners[1:]:
                box_path.lineTo(p)
            box_path.closeSubpath()
            bbox_item.setPath(box_path)
        for i in range(4):
            h = L.handle_items[1 + i]
            if isinstance(h, QGraphicsEllipseItem):
                h.setPos(corners[i])
        rot_dot = L.handle_items[5]
        if isinstance(rot_dot, QGraphicsEllipseItem):
            rot_dot.setPos(rot_pos)
        tether = L.handle_items[6]
        if isinstance(tether, QGraphicsLineItem):
            tether.setLine(QtCore.QLineF(top_center, rot_pos))

    def _ensure_preview_line(self, L: Layer):
        if L.pixmap_item is None or not L.keyframes:
            return
        origin_scene = L.pixmap_item.mapToScene(L.pixmap_item.transformOriginPoint())
        p0 = L.keyframes[-1].pos
        p1 = np.array([origin_scene.x(), origin_scene.y()], dtype=np.float32)
        if L.preview_line is None:
            line = QGraphicsLineItem(p0[0], p0[1], p1[0], p1[1])
            line.setPen(QPen(L.color, 1, Qt.DashLine))
            line.setZValue(950)
            self.scene.addItem(line)
            L.preview_line = line
        else:
            L.preview_line.setLine(p0[0], p0[1], p1[0], p1[1])

    def _update_temp_path_item(self, color: QColor):
        if self.temp_path_item is None:
            self.temp_path_item = QGraphicsPathItem()
            pen = QPen(color, 2)
            self.temp_path_item.setPen(pen)
            self.temp_path_item.setZValue(1000)
            self.scene.addItem(self.temp_path_item)
        if not self.temp_points:
            self.temp_path_item.setPath(QPainterPath())
            return
        path = QPainterPath(self.temp_points[0])
        for p in self.temp_points[1:]:
            path.lineTo(p)
        path.lineTo(self.temp_points[0])
        self.temp_path_item.setPath(path)

    # -------------- Mouse / Keys for polygon drawing --------------
    def mousePressEvent(self, event):
        # Right-click = End Segment (ONLY when not drawing a polygon)
        if self.mode != Canvas.MODE_DRAW_POLY and event.button() == Qt.RightButton:
            self.end_segment_requested.emit()
            event.accept()
            return

        if self.mode == Canvas.MODE_DRAW_POLY:
            try:
                p = event.position()
                scene_pos = self.mapToScene(int(p.x()), int(p.y()))
            except AttributeError:
                scene_pos = self.mapToScene(event.pos())

            if event.button() == Qt.LeftButton:
                self.temp_points.append(scene_pos)
                if len(self.temp_points) == 1:
                    if self.first_click_marker is not None:
                        self.scene.removeItem(self.first_click_marker)
                    self.first_click_marker = QGraphicsEllipseItem(-3, -3, 6, 6)
                    self.first_click_marker.setBrush(QColor(0, 220, 0))
                    self.first_click_marker.setPen(QPen(QColor(0, 0, 0), 1))
                    self.first_click_marker.setZValue(1200)
                    self.scene.addItem(self.first_click_marker)
                    self.first_click_marker.setPos(scene_pos)
                color = self.current_layer.color if self.current_layer else QColor(255, 0, 0)
                self._update_temp_path_item(color)
            elif event.button() == Qt.RightButton:
                # Finish the polygon with a right-click
                preserve = (
                    self.current_layer is not None
                    and self.current_layer.pixmap_item is not None
                    and self.current_layer.is_external
                )
                ok = self.finish_polygon(preserve_motion=preserve)
                self.polygon_finished.emit(ok)
                if not ok:
                    QMessageBox.information(self, "Polygon", "Need at least 3 points.")
                event.accept()
                return

            return

        super().mousePressEvent(event)

    def mouseDoubleClickEvent(self, event):
        if self.mode == Canvas.MODE_DRAW_POLY:
            return
        super().mouseDoubleClickEvent(event)

    def keyPressEvent(self, event: QtGui.QKeyEvent):
        if self.mode == Canvas.MODE_DRAW_POLY:
            if event.key() in (Qt.Key_Return, Qt.Key_Enter):
                # Polygons are finished with a right-click now; ignore Enter.
                return
            elif event.key() == Qt.Key_Backspace:
                if self.temp_points:
                    self.temp_points.pop()
                    color = self.current_layer.color if self.current_layer else QColor(255, 0, 0)
                    self._update_temp_path_item(color)
                return
            elif event.key() == Qt.Key_Escape:
                if self.temp_path_item is not None:
                    self.scene.removeItem(self.temp_path_item); self.temp_path_item = None
                if self.first_click_marker is not None:
                    self.scene.removeItem(self.first_click_marker); self.first_click_marker = None
                self.temp_points = []
                self.mode = Canvas.MODE_IDLE
                return
        super().keyPressEvent(event)

    # -------- Keyframes --------
    def end_segment_add_keyframe(self):
        if not (self.current_layer and self.current_layer.pixmap_item
                and (self.current_layer.polygon_xy is not None) and self.current_layer.keyframes):
            return False
        item = self.current_layer.pixmap_item
        origin_scene = item.mapToScene(item.transformOriginPoint())
        kf = Keyframe(
            pos=np.array([origin_scene.x(), origin_scene.y()], dtype=np.float32),
            rot_deg=float(item.rotation()),
            scale=float(item.scale()) if item.scale() != 0 else 1.0,
            hue_deg=float(self.current_segment_hue_deg)
        )
        L = self.current_layer
        if len(L.keyframes) >= 1:
            p0 = L.keyframes[-1].pos; p1 = kf.pos
            if L.preview_line is not None:
                self.scene.removeItem(L.preview_line); L.preview_line = None
            line = QGraphicsLineItem(p0[0], p0[1], p1[0], p1[1])
            line.setPen(QPen(L.color, 2)); line.setZValue(900); self.scene.addItem(line)
            L.path_lines.append(line)
        L.keyframes.append(kf)
        self._ensure_preview_line(L)
        # reset hue for the next leg
        self.current_segment_hue_deg = 0.0
        # refresh the preview back to neutral
        self._update_current_item_hue_preview()
        return True

    def has_pending_transform(self) -> bool:
        L = self.current_layer
        if not (L and L.pixmap_item and L.keyframes):
            return False
        last = L.keyframes[-1]
        item = L.pixmap_item
        origin_scene = item.mapToScene(item.transformOriginPoint())
        pos = np.array([origin_scene.x(), origin_scene.y()], dtype=np.float32)
        dpos = np.linalg.norm(pos - last.pos)
        drot = abs(float(item.rotation()) - last.rot_deg)
        dscale = abs((float(item.scale()) if item.scale() != 0 else 1.0) - last.scale)
        # the hue preview doesn't count as a "transform" until you end the segment
        return (dpos > 0.5) or (drot > 0.1) or (dscale > 1e-3)

    def revert_to_last_keyframe(self, L: Optional[Layer] = None):
        if L is None:
            L = self.current_layer
        if not (L and L.pixmap_item and L.keyframes):
            return
        last = L.keyframes[-1]
        item = L.pixmap_item
        item.setRotation(last.rot_deg); item.setScale(last.scale)
        origin_scene = item.mapToScene(item.transformOriginPoint())
        d = QPointF(last.pos[0] - origin_scene.x(), last.pos[1] - origin_scene.y())
        item.setPos(item.pos() + d)
        self._ensure_preview_line(L)
        # restore the hue preview to the last keyframe's hue
        self.current_segment_hue_deg = last.hue_deg
        self._update_current_item_hue_preview()

    def _sample_keyframes_constant_speed_with_seg(self, keyframes: List[Keyframe], T: int):
        """
        Allocate frames to segments proportionally to their Euclidean length so that
        translation happens at constant speed across the whole path.
        Returns (pos[T,2], scl[T], rot[T], seg_idx[T], t[T]).
        """
        K = len(keyframes)
        assert K >= 1
        import math
        if T <= 0:
            # degenerate: return empty arrays
            p0 = keyframes[0].pos.astype(np.float32)
            return (np.repeat(p0[None, :], 0, axis=0),
                    np.zeros((0,), np.float32),
                    np.zeros((0,), np.float32),
                    np.zeros((0,), np.int32),
                    np.zeros((0,), np.float32))

        if K == 1:
            p0 = keyframes[0].pos.astype(np.float32)
            pos = np.repeat(p0[None, :], T, axis=0)
            scl = np.full((T,), float(keyframes[0].scale), dtype=np.float32)
            rot = np.full((T,), float(keyframes[0].rot_deg), dtype=np.float32)
            seg_idx = np.zeros((T,), dtype=np.int32)
            t = np.zeros((T,), dtype=np.float32)
            return pos, scl, rot, seg_idx, t

        # Segment lengths (translation only)
        P = np.array([kf.pos for kf in keyframes], dtype=np.float32)  # [K,2]
        seg_vec = P[1:] - P[:-1]                                      # [K-1,2]
        lengths = np.linalg.norm(seg_vec, axis=1)                     # [K-1]
        total_len = float(lengths.sum())

        def _per_seg_counts_uniform():
            # fallback: equal frames per segment, spread as evenly as possible
            base = np.zeros((K - 1,), dtype=np.int32)
            if T > 0:
                q, r = divmod(T, K - 1)
                base[:] = q
                base[:r] += 1
            return base

        if total_len <= 1e-6:
            counts = _per_seg_counts_uniform()
        else:
            # Proportional allocation by length, rounded with largest-remainder
            raw = (lengths / total_len) * T
            base = np.floor(raw).astype(np.int32)
            remainder = T - int(base.sum())
            if remainder > 0:
                order = np.argsort(-(raw - base))  # largest fractional parts first
                base[order[:remainder]] += 1
            counts = base  # may contain zeros for ~zero-length segments

        # Build arrays
        pos_list, scl_list, rot_list, seg_idx_list, t_list = [], [], [], [], []

        for s in range(K - 1):
            n = int(counts[s])
            if n <= 0:
                continue
            # Local times in [0,1) to avoid s+1 overflow in hue blending
            ts = np.linspace(0.0, 1.0, n, endpoint=False, dtype=np.float32)

            p0, p1 = P[s], P[s + 1]
            s0, s1 = max(1e-6, float(keyframes[s].scale)), max(1e-6, float(keyframes[s + 1].scale))
            r0, r1 = float(keyframes[s].rot_deg), float(keyframes[s + 1].rot_deg)

            # Interpolate: linear position/rotation, geometric (log-space) scale
            pos_seg = (1 - ts)[:, None] * p0[None, :] + ts[:, None] * p1[None, :]
            scl_seg = np.exp((1 - ts) * math.log(s0) + ts * math.log(s1))
            rot_seg = (1 - ts) * r0 + ts * r1

            pos_list.append(pos_seg.astype(np.float32))
            scl_list.append(scl_seg.astype(np.float32))
            rot_list.append(rot_seg.astype(np.float32))
            seg_idx_list.append(np.full((n,), s, dtype=np.int32))
            t_list.append(ts.astype(np.float32))

        # If counts summed to < T (can happen when T < #segments), pad by holding the final pose
        N = sum(int(c) for c in counts)
        if N < T:
            p_end = P[-1].astype(np.float32)
            extra = T - N
            pos_list.append(np.repeat(p_end[None, :], extra, axis=0))
            scl_list.append(np.full((extra,), float(keyframes[-1].scale), dtype=np.float32))
            rot_list.append(np.full((extra,), float(keyframes[-1].rot_deg), dtype=np.float32))
            # Use the last real segment index (K-2), with t=0 (blend start of the final segment)
            seg_idx_list.append(np.full((extra,), max(0, K - 2), dtype=np.int32))
            t_list.append(np.zeros((extra,), dtype=np.float32))

        pos = np.vstack(pos_list) if pos_list else np.zeros((T, 2), dtype=np.float32)
        scl = np.concatenate(scl_list) if scl_list else np.zeros((T,), dtype=np.float32)
        rot = np.concatenate(rot_list) if rot_list else np.zeros((T,), dtype=np.float32)
        seg_idx = np.concatenate(seg_idx_list) if seg_idx_list else np.zeros((T,), dtype=np.int32)
        t = np.concatenate(t_list) if t_list else np.zeros((T,), dtype=np.float32)

        # Truncate in case of rounding over-allocation (very rare), or pad if still short
        if len(pos) > T:
            pos, scl, rot, seg_idx, t = pos[:T], scl[:T], rot[:T], seg_idx[:T], t[:T]
        elif len(pos) < T:
            pad = T - len(pos)
            pos = np.vstack([pos, np.repeat(pos[-1:, :], pad, axis=0)])
            scl = np.concatenate([scl, np.repeat(scl[-1:], pad)])
            rot = np.concatenate([rot, np.repeat(rot[-1:], pad)])
            seg_idx = np.concatenate([seg_idx, np.repeat(seg_idx[-1:], pad)])
            t = np.concatenate([t, np.repeat(t[-1:], pad)])

        return pos.astype(np.float32), scl.astype(np.float32), rot.astype(np.float32), seg_idx.astype(np.int32), t.astype(np.float32)
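Two details of the sampler above can be checked in isolation: the largest-remainder rounding that makes per-segment frame counts sum exactly to T, and the log-space scale interpolation that keeps the zoom rate perceptually steady (halfway between 1x and 4x is 2x, not 2.5x). The function names below are illustrative, not part of this file:

```python
import math
import numpy as np

def allocate_frames(lengths, T):
    """Distribute T frames proportionally to segment lengths (largest-remainder rounding)."""
    lengths = np.asarray(lengths, dtype=np.float64)
    raw = lengths / lengths.sum() * T
    base = np.floor(raw).astype(np.int64)
    remainder = T - int(base.sum())
    if remainder > 0:
        order = np.argsort(-(raw - base))  # largest fractional parts first
        base[order[:remainder]] += 1
    return base

def lerp_scale(s0, s1, t):
    """Geometric (log-space) interpolation between two scale factors."""
    return math.exp((1 - t) * math.log(s0) + t * math.log(s1))

# 49 frames over segments of length 30, 10, 10: counts must sum to exactly 49
counts = allocate_frames([30.0, 10.0, 10.0], 49)
# halfway between 1x and 4x zoom
mid = lerp_scale(1.0, 4.0, 0.5)
```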
1180
+
1181
+
1182
+ def undo(self) -> bool:
1183
+ if self.mode == Canvas.MODE_DRAW_POLY and self.temp_points:
1184
+ self.temp_points.pop()
1185
+ color = self.current_layer.color if self.current_layer else QColor(255,0,0)
1186
+ self._update_temp_path_item(color)
1187
+ return True
1188
+ if self.has_pending_transform():
1189
+ self.revert_to_last_keyframe()
1190
+ return True
1191
+ if self.current_layer and len(self.current_layer.keyframes) > 1:
1192
+ L = self.current_layer
1193
+ if L.path_lines:
1194
+ line = L.path_lines.pop(); self.scene.removeItem(line)
1195
+ L.keyframes.pop()
1196
+ self.revert_to_last_keyframe(L)
1197
+ return True
1198
+ if self.current_layer:
1199
+ L = self.current_layer
1200
+ if (L.pixmap_item is not None) and (len(L.keyframes) <= 1) and (len(L.path_lines) == 0):
1201
+ if L.preview_line is not None:
1202
+ self.scene.removeItem(L.preview_line); L.preview_line = None
1203
+ if L.outline_item is not None:
1204
+ self.scene.removeItem(L.outline_item); L.outline_item = None
1205
+ for it in L.handle_items:
1206
+ self.scene.removeItem(it)
1207
+ L.handle_items.clear()
1208
+ self.scene.removeItem(L.pixmap_item); L.pixmap_item = None
1209
+ try:
1210
+ idx = self.layers.index(L)
1211
+ self.layers.pop(idx)
1212
+ except ValueError:
1213
+ pass
1214
+ self.current_layer = self.layers[-1] if self.layers else None
1215
+ if not L.is_external:
1216
+ self._refresh_inpaint_preview()
1217
+ return True
1218
+ return False
1219
+
1220
+ # -------- Demo playback (with hue crossfade) --------
1221
+ def build_preview_frames(self, T_total: int) -> Optional[List[np.ndarray]]:
1222
+ if self.base_bgr is None:
1223
+ return None
1224
+ H, W = self.base_bgr.shape[:2]
1225
+ total_mask = np.zeros((H, W), dtype=bool)
1226
+ for L in self.layers:
1227
+ if not L.has_polygon() or L.is_external:
1228
+ continue
1229
+ poly0 = L.polygon_xy.astype(np.int32)
1230
+ mask = np.zeros((H, W), dtype=np.uint8); cv2.fillPoly(mask, [poly0], 255)
1231
+ total_mask |= (mask > 0)
1232
+ background = inpaint_background(self.base_bgr, total_mask)
1233
+
1234
+ all_layer_frames = []
1235
+ has_any = False
1236
+
1237
+ # def sample_keyframes_uniform_with_seg(keyframes: List[Keyframe], T: int):
1238
+ # K = len(keyframes); assert K >= 1
1239
+ # if K == 1:
1240
+ # pos = np.repeat(keyframes[0].pos[None, :], T, axis=0).astype(np.float32)
1241
+ # scl = np.full((T,), keyframes[0].scale, dtype=np.float32)
1242
+ # rot = np.full((T,), keyframes[0].rot_deg, dtype=np.float32)
1243
+ # seg_idx = np.zeros((T,), dtype=np.int32)
1244
+ # t = np.zeros((T,), dtype=np.float32)
1245
+ # return pos, scl, rot, seg_idx, t
1246
+ # segs = K - 1
1247
+ # u = np.linspace(0.0, float(segs), T, dtype=np.float32)
1248
+ # seg_idx = np.minimum(np.floor(u).astype(int), segs - 1)
1249
+ # t = u - seg_idx
1250
+ # k0 = np.array([[keyframes[i].pos[0], keyframes[i].pos[1], keyframes[i].scale, keyframes[i].rot_deg] for i in seg_idx], dtype=np.float32)
1251
+ # k1 = np.array([[keyframes[i+1].pos[0], keyframes[i+1].pos[1], keyframes[i+1].scale, keyframes[i+1].rot_deg] for i in seg_idx], dtype=np.float32)
1252
+ # pos0 = k0[:, :2]; pos1 = k1[:, :2]
1253
+ # s0 = np.maximum(1e-6, k0[:, 2]); s1 = np.maximum(1e-6, k1[:, 2])
1254
+ # r0 = k0[:, 3]; r1 = k1[:, 3]
1255
+ # pos = (1 - t)[:, None] * pos0 + t[:, None] * pos1
1256
+ # scl = np.exp((1 - t) * np.log(s0) + t * np.log(s1))
1257
+ # rot = (1 - t) * r0 + t * r1
1258
+ # return pos.astype(np.float32), scl.astype(np.float32), rot.astype(np.float32), seg_idx, t
1259
+
1260
+ for L in self.layers:
1261
+ if not L.has_polygon() or len(L.keyframes) < 2:
1262
+ continue
1263
+ has_any = True
1264
+
1265
+ # path_xy, scales, rots, seg_idx, t = sample_keyframes_uniform_with_seg(L.keyframes, T_total)
1266
+ path_xy, scales, rots, seg_idx, t = self._sample_keyframes_constant_speed_with_seg(L.keyframes, T_total)
1267
+
1268
+ origin_xy = L.origin_local_xy if L.origin_local_xy is not None else L.polygon_xy.mean(axis=0)
1269
+
1270
+ # Precompute animations for each keyframe’s hue (crossfade per segment)
1271
+ K = len(L.keyframes)
1272
+ hue_values = [L.keyframes[k].hue_deg for k in range(K)]
1273
+ hue_to_frames: Dict[int, List[np.ndarray]] = {}
1274
+ for k in range(K):
1275
+ bgr_h = apply_hue_shift_bgr(L.source_bgr, hue_values[k])
1276
+ frames_h, _ = animate_polygon(bgr_h, L.polygon_xy, path_xy, scales, rots,
1277
+ interp=cv2.INTER_LINEAR, origin_xy=origin_xy)
1278
+ hue_to_frames[k] = frames_h
1279
+
1280
+ # Mix per frame using seg_idx / t
1281
+ frames_rgba = []
1282
+ for i in range(T_total):
1283
+ s = int(seg_idx[i])
1284
+ w = float(t[i])
1285
+ A = hue_to_frames[s][i].astype(np.float32)
1286
+ B = hue_to_frames[s+1][i].astype(np.float32)
1287
+ mix = (1.0 - w) * A + w * B
1288
+ frames_rgba.append(np.clip(mix, 0, 255).astype(np.uint8))
1289
+ all_layer_frames.append(frames_rgba)
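The per-frame mix above is a plain linear crossfade between the two precomputed hue renders of a segment, weighted by the segment-local parameter t. A minimal sketch of that blend in isolation (hypothetical helper, assuming uint8 frames as produced by animate_polygon):

```python
import numpy as np

def crossfade(frame_a, frame_b, w):
    """Linearly blend two uint8 frames: w=0 returns A, w=1 returns B.

    Blending is done in float32 to avoid uint8 wraparound, then clipped
    back to the valid [0, 255] range.
    """
    mix = (1.0 - w) * frame_a.astype(np.float32) + w * frame_b.astype(np.float32)
    return np.clip(mix, 0, 255).astype(np.uint8)
```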
1290
+
1291
+ if not has_any:
1292
+ return None
1293
+ frames_out = composite_frames(background, all_layer_frames)
1294
+ return frames_out
1295
+
1296
+ def play_demo(self, fps: int, T_total: int):
1297
+ frames = self.build_preview_frames(T_total)
1298
+ if not frames:
1299
+ QMessageBox.information(self, "Play Demo", "Nothing to play yet. Add a polygon and keyframes.")
1300
+ return
1301
+ self.play_frames = frames
1302
+ self.play_index = 0
1303
+ if self.player_item is None:
1304
+ self.player_item = QGraphicsPixmapItem()
1305
+ self.player_item.setZValue(5000)
1306
+ self.scene.addItem(self.player_item)
1307
+ self.player_item.setVisible(True)
1308
+ self._on_play_tick()
1309
+ interval_ms = max(1, int(1000 / max(1, fps)))
1310
+ self.play_timer.start(interval_ms)
1311
+
1312
+ def _on_play_tick(self):
1313
+ if not self.play_frames or self.play_index >= len(self.play_frames):
1314
+ self.play_timer.stop()
1315
+ if self.player_item is not None:
1316
+ self.player_item.setVisible(False)
1317
+ return
1318
+ frame = self.play_frames[self.play_index]
1319
+ self.play_index += 1
1320
+ self.player_item.setPixmap(np_bgr_to_qpixmap(frame))
1321
+
1322
+ # ------------------------------
1323
+ # Main window / controls
1324
+ # ------------------------------
1325
+
1326
+ class MainWindow(QMainWindow):
1327
+ def __init__(self):
1328
+ super().__init__()
1329
+ self.setWindowTitle("Time-to-Move: Cut & Drag")
1330
+ self.resize(1180, 840)
1331
+
1332
+ self.canvas = Canvas(self)
1333
+ self.canvas.polygon_finished.connect(self._on_canvas_polygon_finished)
1334
+ self.canvas.end_segment_requested.connect(self.on_end_segment)
1335
+
1336
+ # -------- Instruction banner above canvas (CENTERED) --------
1337
+ self.instruction_label = QLabel()
1338
+ self.instruction_label.setWordWrap(True)
1339
+ self.instruction_label.setAlignment(Qt.AlignHCenter | Qt.AlignVCenter)
1340
+ self.instruction_label.setStyleSheet("""
1341
+ QLabel {
1342
+ background: #f7f7fa;
1343
+ border-bottom: 1px solid #ddd;
1344
+ padding: 10px 12px;
1345
+ font-size: 15px;
1346
+ color: #222;
1347
+ }
1348
+ """)
1349
+ self._set_instruction("Welcome! • Select Image to begin.")
1350
+
1351
+ central = QWidget()
1352
+ v = QVBoxLayout(central)
1353
+ v.setContentsMargins(0,0,0,0); v.setSpacing(0)
1354
+ v.addWidget(self.instruction_label)
1355
+ v.addWidget(self.canvas)
1356
+ self.setCentralWidget(central)
1357
+
1358
+ # state: external placing mode?
1359
+ self.placing_external: bool = False
1360
+ self.placing_layer: Optional[Layer] = None
1361
+
1362
+ # -------- Vertical toolbar on the LEFT --------
1363
+ tb = QToolBar("Tools")
1364
+ self.addToolBar(Qt.LeftToolBarArea, tb)
1365
+ tb.setOrientation(Qt.Vertical)
1366
+
1367
+ def add_btn(text: str, slot, icon: Optional[QIcon] = None):
1368
+ btn = QPushButton(text)
1369
+ if icon: btn.setIcon(icon)
1370
+ btn.setCursor(Qt.PointingHandCursor)
1371
+ btn.setMinimumWidth(180)
1372
+ btn.clicked.connect(slot)
1373
+ tb.addWidget(btn); return btn
1374
+
1375
+ # Fit dropdown (default: Center Crop)
1376
+ self.cmb_fit = QComboBox(); self.cmb_fit.addItems(["Center Crop", "Center Pad"])
1377
+ tb.addWidget(self.cmb_fit)
1378
+ self.canvas.fit_mode_combo = self.cmb_fit
1379
+
1380
+ # Dotted separator
1381
+ line_sep = QFrame(); line_sep.setFrameShape(QFrame.HLine); line_sep.setFrameShadow(QFrame.Plain)
1382
+ line_sep.setStyleSheet("color: #888; border-top: 1px dotted #888; margin: 8px 0;")
1383
+ tb.addWidget(line_sep)
1384
+
1385
+ # Select Image
1386
+ self.btn_select = add_btn("🖼️ Select Image", self.on_select_base)
1387
+
1388
+ # Add Polygon (toggles to Finish)
1389
+ self.pent_icon = self.canvas.make_pentagon_icon()
1390
+ self.btn_add_poly = add_btn("Add Polygon", self.on_add_polygon_toggled, icon=self.pent_icon)
1391
+ self.add_poly_active = False
1392
+
1393
+ # Add External (two-step: file → Place)
1394
+ self.btn_add_external = add_btn("🖼️➕ Add External Image", self.on_add_or_place_external)
1395
+
1396
+ # HUE TRANSFORM (slider + Default) ABOVE End Segment
1397
+ tb.addSeparator()
1398
+ tb.addWidget(QLabel("Hue Transform"))
1399
+ hue_row = QWidget(); row = QHBoxLayout(hue_row); row.setContentsMargins(0,0,0,0)
1400
+ self.sld_hue = QSlider(Qt.Horizontal); self.sld_hue.setRange(-180, 180); self.sld_hue.setValue(0)
1401
+ btn_default = QPushButton("Default"); btn_default.setCursor(Qt.PointingHandCursor); btn_default.setFixedWidth(70)
1402
+ row.addWidget(self.sld_hue, 1); row.addWidget(btn_default, 0)
1403
+ tb.addWidget(hue_row)
1404
+ self.sld_hue.valueChanged.connect(self.on_hue_changed)
1405
+ btn_default.clicked.connect(lambda: self.sld_hue.setValue(0))
1406
+
1407
+ # End Segment and Undo
1408
+ self.btn_end_seg = add_btn("🎯 End Segment", self.on_end_segment)
1409
+ self.btn_undo = add_btn("↩️ Undo", self.on_undo)
1410
+
1411
+ tb.addSeparator()
1412
+ tb.addWidget(QLabel("Total Frames:"))
1413
+ self.spn_total_frames = QSpinBox(); self.spn_total_frames.setRange(1, 2000); self.spn_total_frames.setValue(81)
1414
+ tb.addWidget(self.spn_total_frames)
1415
+ tb.addWidget(QLabel("FPS:"))
1416
+ self.spn_fps = QSpinBox(); self.spn_fps.setRange(1, 120); self.spn_fps.setValue(16)
1417
+ tb.addWidget(self.spn_fps)
1418
+
1419
+ tb.addSeparator()
1420
+ self.btn_play = add_btn("▶️ Play Demo", self.on_play_demo)
1421
+ tb.addWidget(QLabel("Prompt"))
1422
+ self.txt_prompt = QPlainTextEdit()
1423
+ self.txt_prompt.setFixedHeight(80) # ~3–5 lines tall; tweak if you like
1424
+ self.txt_prompt.setMinimumWidth(180) # matches your button width
1425
+ # Optional: QPlainTextEdit.setPlaceholderText is available in PySide6; uncomment to show hint text:
1426
+ # self.txt_prompt.setPlaceholderText("Write your prompt here…")
1427
+ tb.addWidget(self.txt_prompt)
1428
+ self.btn_save = add_btn("💾 Save", self.on_save)
1429
+ self.btn_new = add_btn("🆕 New", self.on_new)
1430
+ self.btn_exit = add_btn("⏹️ Exit", self.close)
1431
+
1432
+ # Status strip at bottom
1433
+ status = QToolBar("Status")
1434
+ self.addToolBar(Qt.BottomToolBarArea, status)
1435
+ self.status_label = QLabel("Ready")
1436
+ status.addWidget(self.status_label)
1437
+
1438
+ # ---------- Instruction helper ----------
1439
+ def _set_instruction(self, text: str):
1440
+ self.instruction_label.setText(text)
1441
+
1442
+ # ---------- Pending-segment guards ----------
1443
+ def _block_if_pending_segment(self, action_label: str) -> bool:
1444
+ if self.canvas.current_layer and self.canvas.has_pending_transform():
1445
+ QMessageBox.information(
1446
+ self, "Finish Segment",
1447
+ f"Please end the current segment (click '🎯 End Segment') before {action_label}."
1448
+ )
1449
+ self._set_instruction("Finish current segment: drag/scale/rotate as needed, adjust Hue, then click 🎯 End Segment.")
1450
+ return True
1451
+ return False
1452
+
1453
+ # ------------- Actions -------------
1454
+ def on_select_base(self):
1455
+ if self._block_if_pending_segment("changing the base image"):
1456
+ return
1457
+ path, _ = QFileDialog.getOpenFileName(self, "Select image", "", "Images/Videos (*.png *.jpg *.jpeg *.bmp *.mp4 *.mov *.avi *.mkv)")
1458
+ if not path:
1459
+ self._set_instruction("No image selected. Click ‘Select Image’ to begin.")
1460
+ return
1461
+ try:
1462
+ raw = load_first_frame(path)
1463
+ except Exception as e:
1464
+ QMessageBox.critical(self, "Load", f"Failed to load: {e}")
1465
+ return
1466
+ self.canvas.set_base_image(raw)
1467
+ self.add_poly_active = False
1468
+ self.btn_add_poly.setText("Add Polygon")
1469
+ self.placing_external = False; self.placing_layer = None
1470
+ self.btn_add_external.setText("🖼️➕ Add External Image")
1471
+ self.status_label.setText("Base loaded.")
1472
+ self._set_instruction("Step 1: Add a polygon (Add Polygon), or add an external sprite (Add External Image).")
1473
+
1474
+ def on_add_polygon_toggled(self):
1475
+ if self.placing_external:
1476
+ QMessageBox.information(self, "Place External First",
1477
+ "Please place the external image first (click ‘✅ Place External Image’).")
1478
+ self._set_instruction("Place External: drag/scale/rotate to choose starting pose, then click ‘✅ Place External Image’.")
1479
+ return
1480
+
1481
+ if (not self.add_poly_active) and self._block_if_pending_segment("adding a polygon"):
1482
+ return
1483
+
1484
+ if not self.add_poly_active:
1485
+ if self.canvas.base_bgr is None:
1486
+ QMessageBox.information(self, "Add Polygon", "Please select an image first.")
1487
+ self._set_instruction("Click ‘Select Image’ to begin.")
1488
+ return
1489
+
1490
+ # --- KEY CHANGE ---
1491
+ # If there's no current layer OR the current layer is BASE -> make a NEW BASE layer (new color).
1492
+ # If the current layer is EXTERNAL -> split/cut that external (preserve motion).
1493
+ if (self.canvas.current_layer is None) or (not self.canvas.current_layer.is_external):
1494
+ # New polygon on the base image => new layer with a fresh color
1495
+ self.canvas.new_layer_from_source(
1496
+ name=f"Layer {len(self.canvas.layers)+1}",
1497
+ source_bgr=self.canvas.base_bgr,
1498
+ is_external=False
1499
+ )
1500
+ else:
1501
+ # Current layer is external: go into "draw polygon to cut external" mode
1502
+ self.canvas.start_draw_polygon(preserve_motion=True)
1503
+ # --- END KEY CHANGE ---
1504
+
1505
+ self.add_poly_active = True
1506
+ self.btn_add_poly.setText("✅ Finish Polygon Selection")
1507
+ self.status_label.setText("Drawing polygon…")
1508
+ self._set_instruction("Polygon mode: Left-click to add points. Backspace = undo point. Right-click = finish. Esc = cancel.")
1509
+ else:
1510
+ # Finish current polygon selection
1511
+ preserve = (self.canvas.current_layer is not None and
1512
+ self.canvas.current_layer.pixmap_item is not None and
1513
+ self.canvas.current_layer.is_external)
1514
+ ok = self.canvas.finish_polygon(preserve_motion=preserve)
1515
+ if not ok:
1516
+ QMessageBox.information(self, "Polygon", "Need at least 3 points (keep adding).")
1517
+ self._set_instruction("Keep adding polygon points (≥3). Right-click to finish.")
1518
+ return
1519
+ self.add_poly_active = False
1520
+ self.btn_add_poly.setText("Add Polygon")
1521
+ self.status_label.setText("Polygon ready.")
1522
+ self._set_instruction(
1523
+ "Drag to move, use corner circles to scale, top dot to rotate. "
1524
+ "Adjust Hue if you like, then click ‘🎯 End Segment’ or Right Click to record a move."
1525
+ )
1526
+ def on_add_or_place_external(self):
1527
+ # If we're already in "placing" mode, finalize the initial keyframe.
1528
+ if self.placing_external and self.placing_layer is not None:
1529
+ try:
1530
+ # Lock the initial pose as keyframe #1
1531
+ self.canvas.place_external_initial_keyframe(self.placing_layer)
1532
+ # Make sure this layer stays selected
1533
+ self.canvas.current_layer = self.placing_layer
1534
+ # Draw the dashed preview line if relevant
1535
+ self.canvas._ensure_preview_line(self.placing_layer)
1536
+ finally:
1537
+ self.placing_external = False
1538
+ self.placing_layer = None
1539
+ self.btn_add_external.setText("🖼️➕ Add External Image")
1540
+ self.status_label.setText("External starting pose locked.")
1541
+ self._set_instruction("Now drag/scale/rotate and click ‘🎯 End Segment’ to record movement.")
1542
+ return
1543
+
1544
+ # Otherwise, begin adding a new external image.
1545
+ if self._block_if_pending_segment("adding an external image"):
1546
+ return
1547
+ if self.canvas.base_bgr is None:
1548
+ QMessageBox.information(self, "External", "Select a base image first.")
1549
+ self._set_instruction("Click ‘Select Image’ to begin.")
1550
+ return
1551
+
1552
+ path, _ = QFileDialog.getOpenFileName(
1553
+ self, "Select external image", "",
1554
+ "Images/Videos (*.png *.jpg *.jpeg *.bmp *.mp4 *.mov *.avi *.mkv)"
1555
+ )
1556
+ if not path:
1557
+ self._set_instruction("External not chosen. You can Add External Image later.")
1558
+ return
1559
+
1560
+ try:
1561
+ raw = load_first_frame(path)
1562
+ except Exception as e:
1563
+ QMessageBox.critical(self, "Load", f"Failed to load external: {e}")
1564
+ return
1565
+
1566
+ L = self.canvas.add_external_sprite_layer(raw) # no keyframe yet
1567
+ if L is None:
1568
+ QMessageBox.critical(self, "External", "Failed to create external layer.")
1569
+ return
1570
+
1571
+ self.placing_external = True
1572
+ self.placing_layer = L
1573
+ self.canvas.current_layer = L # keep selection consistent
1574
+ self.btn_add_external.setText("✅ Place External Image")
1575
+ self.status_label.setText("Place external image.")
1576
+ self._set_instruction("Place External: drag into view, scale with corner circles, rotate with top dot. Then click ‘✅ Place External Image’.")
1577
+
1578
+
1579
+ def _on_canvas_polygon_finished(self, ok: bool):
1580
+ if ok:
1581
+ self.add_poly_active = False
1582
+ self.btn_add_poly.setText("Add Polygon")
1583
+ self.status_label.setText("Polygon ready.")
1584
+ self._set_instruction(
1585
+ "Drag to move, use corner circles to scale, top dot to rotate. "
1586
+ "Adjust Hue if you like, then click ‘🎯 End Segment’ or Right Click to record a move."
1587
+ )
1588
+ else:
1589
+ # keep your existing “need ≥3 points” behavior; nothing else to do here
1590
+ pass
1591
+
1592
+ def on_hue_changed(self, val: int):
1593
+ self.canvas.current_segment_hue_deg = float(val)
1594
+ self.canvas._update_current_item_hue_preview()
1595
+
1596
+ def on_end_segment(self):
1597
+ if self.placing_external:
1598
+ QMessageBox.information(self, "Place External First",
1599
+ "Please place the external image first (click ‘✅ Place External Image’).")
1600
+ self._set_instruction("Place External: drag/scale/rotate to choose starting pose, then click ‘✅ Place External Image’.")
1601
+ return
1602
+ ok = self.canvas.end_segment_add_keyframe()
1603
+ if ok:
1604
+ n = len(self.canvas.current_layer.keyframes) if self.canvas.current_layer else 0
1605
+ self.status_label.setText(f"Keyframe #{n} added.")
1606
+ self._set_instruction(
1607
+ "Segment added! Move again for the next leg, adjust Hue if you like, "
1608
+ "then click ‘🎯 End Segment’ or Right Click to record a move."
1609
+ )
1610
+ else:
1611
+ QMessageBox.information(self, "End Segment", "Nothing to record yet. Add/finish a polygon or add/place an external sprite first.")
1612
+ self._set_instruction("Add a polygon (base/external) or place an external image, then drag and click ‘🎯 End Segment’.")
1613
+
1614
+ def on_undo(self):
1615
+ if self.placing_external and self.placing_layer is not None:
1616
+ L = self.placing_layer
1617
+ if L.pixmap_item is not None: self.canvas.scene.removeItem(L.pixmap_item)
1618
+ if L.outline_item is not None: self.canvas.scene.removeItem(L.outline_item)
1619
+ for it in L.handle_items: self.canvas.scene.removeItem(it)
1620
+ try:
1621
+ idx = self.canvas.layers.index(L)
1622
+ self.canvas.layers.pop(idx)
1623
+ except ValueError:
1624
+ pass
1625
+ self.canvas.current_layer = self.canvas.layers[-1] if self.canvas.layers else None
1626
+ self.placing_layer = None
1627
+ self.placing_external = False
1628
+ self.btn_add_external.setText("🖼️➕ Add External Image")
1629
+ self.status_label.setText("External placement canceled.")
1630
+ self._set_instruction("External placement canceled. Add External Image again or continue editing.")
1631
+ return
1632
+
1633
+ if self.canvas.undo():
1634
+ self.status_label.setText("Undo applied.")
1635
+ self._set_instruction("Undone. Continue editing, or click ‘🎯 End Segment’ to record movement.")
1636
+ else:
1637
+ self.status_label.setText("Nothing to undo.")
1638
+ self._set_instruction("Nothing to undo. Drag/scale/rotate and click ‘🎯 End Segment’, or add new shapes.")
1639
+
1640
+ def _sample_keyframes_uniform(self, keyframes: List[Keyframe], T: int):
1641
+ K = len(keyframes); assert K >= 2
1642
+ segs = K - 1
1643
+ u = np.linspace(0.0, float(segs), T, dtype=np.float32)
1644
+ seg_idx = np.minimum(np.floor(u).astype(int), segs - 1)
1645
+ t = u - seg_idx
1646
+ k0 = np.array([[keyframes[i].pos[0], keyframes[i].pos[1], keyframes[i].scale, keyframes[i].rot_deg] for i in seg_idx], dtype=np.float32)
1647
+ k1 = np.array([[keyframes[i+1].pos[0], keyframes[i+1].pos[1], keyframes[i+1].scale, keyframes[i+1].rot_deg] for i in seg_idx], dtype=np.float32)
1648
+ pos0 = k0[:, :2]; pos1 = k1[:, :2]
1649
+ s0 = np.maximum(1e-6, k0[:, 2]); s1 = np.maximum(1e-6, k1[:, 2])
1650
+ r0 = k0[:, 3]; r1 = k1[:, 3]
1651
+ pos = (1 - t)[:, None] * pos0 + t[:, None] * pos1
1652
+ scl = np.exp((1 - t) * np.log(s0) + t * np.log(s1))
1653
+ rot = (1 - t) * r0 + t * r1
1654
+ return pos.astype(np.float32), scl.astype(np.float32), rot.astype(np.float32)
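Note that position and rotation are interpolated linearly, while scale is interpolated in log space: exp((1-t)·ln s0 + t·ln s1) equals s0^(1-t)·s1^t, so a zoom from 1× to 4× passes through 2× at the midpoint instead of 2.5×, which reads as a uniform zoom rate. A standalone sketch of that geometric interpolation (hypothetical helper mirroring the clamping above):

```python
import numpy as np

def geometric_lerp(s0, s1, t):
    """Interpolate a scale factor in log space (geometric interpolation).

    Equivalent to s0**(1-t) * s1**t; inputs are clamped away from zero
    so the logarithm is always defined.
    """
    s0 = max(1e-6, float(s0))
    s1 = max(1e-6, float(s1))
    return float(np.exp((1.0 - t) * np.log(s0) + t * np.log(s1)))
```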
1655
+
1656
+ def on_play_demo(self):
1657
+ if self.canvas.base_bgr is None:
1658
+ QMessageBox.information(self, "Play Demo", "Select an image first.")
1659
+ self._set_instruction("Click ‘Select Image’ to begin.")
1660
+ return
1661
+ has_segments = any((L.polygon_xy is not None and len(L.keyframes) >= 2) for L in self.canvas.layers)
1662
+ if not has_segments:
1663
+ QMessageBox.information(self, "Play Demo", "No motion segments yet. Drag something and click ‘🎯 End Segment’ at least once.")
1664
+ self._set_instruction("Create at least one movement: drag/scale/rotate then click ‘🎯 End Segment’.")
1665
+ return
1666
+ fps = int(self.spn_fps.value())
1667
+ T_total = int(self.spn_total_frames.value())
1668
+ self.canvas.play_demo(fps=fps, T_total=T_total)
1669
+ self._set_instruction("Playing demo… When it ends, you’ll return to the editor. Tweak and play again, or 💾 Save.")
1670
+
1671
+ def on_new(self):
1672
+ if self._block_if_pending_segment("starting a new project"):
1673
+ return
1674
+ self.canvas.scene.clear()
1675
+ self.canvas.layers.clear()
1676
+ self.canvas.current_layer = None
1677
+ self.canvas.base_bgr = None
1678
+ self.canvas.base_preview_bgr = None
1679
+ self.canvas.base_item = None
1680
+ self.add_poly_active = False
1681
+ self.btn_add_poly.setText("Add Polygon")
1682
+ self.placing_external = False; self.placing_layer = None
1683
+ self.btn_add_external.setText("🖼️➕ Add External Image")
1684
+ if hasattr(self, "txt_prompt"):
1685
+ self.txt_prompt.clear()
1686
+ self.status_label.setText("Ready")
1687
+ self._set_instruction("New project. Click ‘Select Image’ to begin.")
1688
+ self.on_select_base()
1689
+
1690
+ def on_save(self):
1691
+ if self._block_if_pending_segment("saving"):
1692
+ return
1693
+ if self.canvas.base_bgr is None or not self.canvas.layers:
1694
+ QMessageBox.information(self, "Save", "Load an image and add at least one polygon/sprite first.")
1695
+ self._set_instruction("Add a polygon (base/external), record segments (🎯 End Segment), then Save.")
1696
+ return
1697
+
1698
+ # If any layer has exactly one keyframe, auto-add the current pose as a second keyframe
1699
+ for L in self.canvas.layers:
1700
+ if L.pixmap_item and L.polygon_xy is not None and len(L.keyframes) == 1:
1701
+ self.canvas.current_layer = L
1702
+ self.canvas.end_segment_add_keyframe()
1703
+
1704
+ # 1) Pick a parent directory
1705
+ base_dir = QtWidgets.QFileDialog.getExistingDirectory(
1706
+ self, "Select output directory", ""
1707
+ )
1708
+ if not base_dir:
1709
+ self._set_instruction("Save canceled. You can keep editing or try ▶️ Play Demo.")
1710
+ return
1711
+
1712
+ # 2) Ask for a subdirectory name
1713
+ subdir_name, ok = QtWidgets.QInputDialog.getText(
1714
+ self, "Subfolder Name", "Create a new subfolder in the selected directory:"
1715
+ )
1716
+ if not ok or not subdir_name.strip():
1717
+ self._set_instruction("Save canceled (no subfolder name).")
1718
+ return
1719
+ subdir_name = subdir_name.strip()
1720
+
1721
+ final_dir = os.path.join(base_dir, subdir_name)
1722
+ if os.path.exists(final_dir):
1723
+ resp = QMessageBox.question(
1724
+ self, "Folder exists",
1725
+ f"'{subdir_name}' already exists in the selected directory.\n"
1726
+ f"Use it and overwrite files?",
1727
+ QMessageBox.Yes | QMessageBox.No, QMessageBox.No
1728
+ )
1729
+ if resp != QMessageBox.Yes:
1730
+ self._set_instruction("Save canceled. Choose another name or directory next time.")
1731
+ return
1732
+ else:
1733
+ try:
1734
+ os.makedirs(final_dir, exist_ok=True)
1735
+ except Exception as e:
1736
+ QMessageBox.critical(self, "Save", f"Failed to create folder:\n{e}")
1737
+ return
1738
+
1739
+ try:
1740
+ prompt_text = self.txt_prompt.toPlainText()
1741
+ except Exception:
1742
+ prompt_text = ""
1743
+ try:
1744
+ with open(os.path.join(final_dir, "prompt.txt"), "w", encoding="utf-8") as f:
1745
+ f.write(prompt_text)
1746
+ except Exception as e:
1747
+ # Non-fatal: continue saving the rest if prompt write fails
1748
+ print(f"[warn] Failed to write prompt.txt: {e}")
1749
+
1750
+ # Output paths
1751
+ first_frame_path = os.path.join(final_dir, "first_frame.png")
1752
+ motion_path = os.path.join(final_dir, "motion_signal.mp4")
1753
+ mask_path = os.path.join(final_dir, "mask.mp4")
1754
+ base_title = subdir_name # for optional numpy save below
1755
+ npy_path = os.path.join(final_dir, f"{base_title}_polygons.npy")
1756
+
1757
+ fps = int(self.spn_fps.value())
1758
+ T_total = int(self.spn_total_frames.value())
1759
+
1760
+ # Build background (inpaint base regions belonging to non-external layers)
1761
+ H, W = self.canvas.base_bgr.shape[:2]
1762
+ total_mask = np.zeros((H, W), dtype=bool)
1763
+ for L in self.canvas.layers:
1764
+ if L.polygon_xy is None:
1765
+ continue
1766
+ if L.is_external:
1767
+ continue
1768
+ poly0 = L.polygon_xy.astype(np.int32)
1769
+ m = np.zeros((H, W), dtype=np.uint8)
1770
+ cv2.fillPoly(m, [poly0], 255)
1771
+ total_mask |= (m > 0)
1772
+ background = inpaint_background(self.canvas.base_bgr, total_mask)
1773
+
1774
+ # Collect animated frames for each layer (with hue crossfade as before)
1775
+ all_layer_frames = []
1776
+ layer_polys = [] # kept for the optional numpy block below
1777
+ for L in self.canvas.layers:
1778
+ if L.polygon_xy is None or len(L.keyframes) < 2:
1779
+ continue
1780
+
1781
+ def sample_keyframes_uniform_with_seg(keyframes: List[Keyframe], T: int):
1782
+ K = len(keyframes); assert K >= 1
1783
+ if K == 1:
1784
+ pos = np.repeat(keyframes[0].pos[None, :], T, axis=0).astype(np.float32)
1785
+ scl = np.full((T,), keyframes[0].scale, dtype=np.float32)
1786
+ rot = np.full((T,), keyframes[0].rot_deg, dtype=np.float32)
1787
+ seg_idx = np.zeros((T,), dtype=np.int32)
1788
+ t = np.zeros((T,), dtype=np.float32)
1789
+ return pos, scl, rot, seg_idx, t
1790
+ segs = K - 1
1791
+ u = np.linspace(0.0, float(segs), T, dtype=np.float32)
1792
+ seg_idx = np.minimum(np.floor(u).astype(int), segs - 1)
1793
+ t = u - seg_idx
1794
+ k0 = np.array([[keyframes[i].pos[0], keyframes[i].pos[1], keyframes[i].scale, keyframes[i].rot_deg] for i in seg_idx], dtype=np.float32)
1795
+ k1 = np.array([[keyframes[i+1].pos[0], keyframes[i+1].pos[1], keyframes[i+1].scale, keyframes[i+1].rot_deg] for i in seg_idx], dtype=np.float32)
1796
+ pos0 = k0[:, :2]; pos1 = k1[:, :2]
1797
+ s0 = np.maximum(1e-6, k0[:, 2]); s1 = np.maximum(1e-6, k1[:, 2])
1798
+ r0 = k0[:, 3]; r1 = k1[:, 3]
1799
+ pos = (1 - t)[:, None] * pos0 + t[:, None] * pos1
1800
+ scl = np.exp((1 - t) * np.log(s0) + t * np.log(s1))
1801
+ rot = (1 - t) * r0 + t * r1
1802
+ return pos.astype(np.float32), scl.astype(np.float32), rot.astype(np.float32), seg_idx, t
1803
+
1804
+ path_xy, scales, rots, seg_idx, t = sample_keyframes_uniform_with_seg(L.keyframes, T_total)
1805
+ origin_xy = L.origin_local_xy if L.origin_local_xy is not None else L.polygon_xy.mean(axis=0)
1806
+
1807
+ # Precompute one animation per keyframe hue
1808
+ K = len(L.keyframes)
1809
+ hue_values = [L.keyframes[k].hue_deg for k in range(K)]
1810
+ hue_to_frames: Dict[int, List[np.ndarray]] = {}
1811
+ polys_for_layer = None
1812
+ for k in range(K):
1813
+ bgr_h = apply_hue_shift_bgr(L.source_bgr, hue_values[k])
1814
+ frames_h, polys = animate_polygon(
1815
+ bgr_h, L.polygon_xy, path_xy, scales, rots,
1816
+ interp=cv2.INTER_LINEAR, origin_xy=origin_xy
1817
+ )
1818
+ hue_to_frames[k] = frames_h
1819
+ if polys_for_layer is None: # same polys for all hues
1820
+ polys_for_layer = np.array(polys, dtype=np.float32)
1821
+ if polys_for_layer is not None:
1822
+ layer_polys.append(polys_for_layer)
1823
+
1824
+ # Mix per frame using seg_idx / t
1825
+ frames_rgba = []
1826
+ for i in range(T_total):
1827
+ s = int(seg_idx[i])
1828
+ w = float(t[i])
1829
+ A = hue_to_frames[s][i].astype(np.float32)
1830
+ B = hue_to_frames[s+1][i].astype(np.float32)
1831
+ mix = (1.0 - w) * A + w * B
1832
+ frames_rgba.append(np.clip(mix, 0, 255).astype(np.uint8))
1833
+ all_layer_frames.append(frames_rgba)
1834
+
1835
+ if not all_layer_frames:
1836
+ QMessageBox.information(self, "Save", "No motion segments found. Add keyframes with ‘🎯 End Segment’.")
1837
+ self._set_instruction("Record at least one segment on a layer, then Save.")
1838
+ return
1839
+
1840
+ frames_out = composite_frames(background, all_layer_frames)
1841
+
1842
+ # Build mask frames (union of alpha across layers per frame)
1843
+ mask_frames = []
1844
+ for ti in range(T_total):  # 'ti' avoids shadowing the sampled t array above
1845
+ m = np.zeros((H, W), dtype=np.uint16)
1846
+ for Lframes in all_layer_frames:
1847
+ m += Lframes[ti][:, :, 3].astype(np.uint16)
1848
+ m = np.clip(m, 0, 255).astype(np.uint8)
1849
+ mask_frames.append(m)
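The mask loop above accumulates per-layer alpha in uint16 before clipping, which avoids uint8 wraparound when overlapping layers sum past 255. A minimal sketch of that union (hypothetical helper, assuming one uint8 alpha plane per layer):

```python
import numpy as np

def union_alpha(alpha_layers):
    """Union per-layer uint8 alpha planes into one uint8 mask.

    Accumulating in uint16 prevents overflow when overlapping alphas
    sum above 255; the clip restores the uint8 range.
    """
    acc = np.zeros(alpha_layers[0].shape, dtype=np.uint16)
    for a in alpha_layers:
        acc += a.astype(np.uint16)
    return np.clip(acc, 0, 255).astype(np.uint8)
```

Summing in uint8 directly would wrap (200 + 200 → 144), producing holes in the mask exactly where layers overlap most.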
1850
+
1851
+ # --- Actual saving ---
1852
+ try:
1853
+ # first_frame.png (copy of the base image used for saving)
1854
+ cv2.imwrite(first_frame_path, self.canvas.base_bgr)
1855
+
1856
+ # motion_signal.mp4 = composited warped video
1857
+ save_video_mp4(frames_out, motion_path, fps=fps)
1858
+
1859
+ # mask.mp4 = grayscale mask video
1860
+ save_video_mp4([cv2.cvtColor(m, cv2.COLOR_GRAY2BGR) for m in mask_frames], mask_path, fps=fps)
1861
+
1862
+ # Optional: polygons.npy — disabled by default
1863
+ if False:
1864
+ # Pad and save polygons
1865
+ Vmax = 0
1866
+ for P in layer_polys:
1867
+ if P.size:
1868
+ Vmax = max(Vmax, P.shape[1])
1869
+
1870
+ def pad_poly(P: np.ndarray, Vmax_: int) -> np.ndarray:
1871
+ if P.size == 0:
1872
+ return np.zeros((T_total, Vmax_, 2), dtype=np.float32)
1873
+ T_, V, _ = P.shape
1874
+ out = np.zeros((T_, Vmax_, 2), dtype=np.float32)
1875
+ out[:, :V, :] = P
1876
+ if V > 0:
1877
+ out[:, V:, :] = P[:, V-1:V, :]
1878
+ return out
1879
+
1880
+ polys_uniform = np.stack([pad_poly(P, Vmax) for P in layer_polys], axis=0)
1881
+ np.save(npy_path, polys_uniform)
1882
+
1883
+ except Exception as e:
1884
+ QMessageBox.critical(self, "Save", f"Failed to save:\n{e}")
1885
+ return
1886
+
1887
+ QMessageBox.information(self, "Saved", f"Saved to:\n{final_dir}")
1888
+ self._set_instruction("Saved! You can keep editing, play demo again, or start a New project.")
1889
+
1890
+
1891
+ # ------------------------------
1892
+ # Entry
1893
+ # ------------------------------
1894
+
1895
+ def main():
1896
+ if sys.version_info < (3, 8):
1897
+ print("[Warning] PySide6 officially supports Python 3.8+. You're on %d.%d." % (sys.version_info.major, sys.version_info.minor))
1898
+ app = QApplication(sys.argv)
1899
+ w = MainWindow()
1900
+ w.show()
1901
+ sys.exit(app.exec())
1902
+
1903
+ if __name__ == "__main__":
1904
+ main()
LICENSE ADDED
@@ -0,0 +1,201 @@
1
+ Apache License
2
+ Version 2.0, January 2004
3
+ http://www.apache.org/licenses/
4
+
5
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6
+
7
+ 1. Definitions.
8
+
9
+ "License" shall mean the terms and conditions for use, reproduction,
10
+ and distribution as defined by Sections 1 through 9 of this document.
11
+
12
+ "Licensor" shall mean the copyright owner or entity authorized by
13
+ the copyright owner that is granting the License.
14
+
15
+ "Legal Entity" shall mean the union of the acting entity and all
16
+ other entities that control, are controlled by, or are under common
17
+ control with that entity. For the purposes of this definition,
18
+ "control" means (i) the power, direct or indirect, to cause the
19
+ direction or management of such entity, whether by contract or
20
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
21
+ outstanding shares, or (iii) beneficial ownership of such entity.
22
+
23
+ "You" (or "Your") shall mean an individual or Legal Entity
24
+ exercising permissions granted by this License.
25
+
26
+ "Source" form shall mean the preferred form for making modifications,
27
+ including but not limited to software source code, documentation
28
+ source, and configuration files.
29
+
30
+ "Object" form shall mean any form resulting from mechanical
31
+ transformation or translation of a Source form, including but
32
+ not limited to compiled object code, generated documentation,
33
+ and conversions to other media types.
34
+
35
+ "Work" shall mean the work of authorship, whether in Source or
36
+ Object form, made available under the License, as indicated by a
37
+ copyright notice that is included in or attached to the work
38
+ (an example is provided in the Appendix below).
39
+
40
+ "Derivative Works" shall mean any work, whether in Source or Object
41
+ form, that is based on (or derived from) the Work and for which the
42
+ editorial revisions, annotations, elaborations, or other modifications
43
+ represent, as a whole, an original work of authorship. For the purposes
44
+ of this License, Derivative Works shall not include works that remain
45
+ separable from, or merely link (or bind by name) to the interfaces of,
46
+ the Work and Derivative Works thereof.
47
+
48
+ "Contribution" shall mean any work of authorship, including
49
+ the original version of the Work and any modifications or additions
50
+ to that Work or Derivative Works thereof, that is intentionally
51
+ submitted to Licensor for inclusion in the Work by the copyright owner
52
+ or by an individual or Legal Entity authorized to submit on behalf of
53
+ the copyright owner. For the purposes of this definition, "submitted"
54
+ means any form of electronic, verbal, or written communication sent
55
+ to the Licensor or its representatives, including but not limited to
56
+ communication on electronic mailing lists, source code control systems,
57
+ and issue tracking systems that are managed by, or on behalf of, the
58
+ Licensor for the purpose of discussing and improving the Work, but
59
+ excluding communication that is conspicuously marked or otherwise
60
+ designated in writing by the copyright owner as "Not a Contribution."
61
+
62
+ "Contributor" shall mean Licensor and any individual or Legal Entity
63
+ on behalf of whom a Contribution has been received by Licensor and
64
+ subsequently incorporated within the Work.
65
+
66
+ 2. Grant of Copyright License. Subject to the terms and conditions of
67
+ this License, each Contributor hereby grants to You a perpetual,
68
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
69
+ copyright license to reproduce, prepare Derivative Works of,
70
+ publicly display, publicly perform, sublicense, and distribute the
71
+ Work and such Derivative Works in Source or Object form.
72
+
73
+ 3. Grant of Patent License. Subject to the terms and conditions of
74
+ this License, each Contributor hereby grants to You a perpetual,
75
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
76
+ (except as stated in this section) patent license to make, have made,
77
+ use, offer to sell, sell, import, and otherwise transfer the Work,
78
+ where such license applies only to those patent claims licensable
79
+ by such Contributor that are necessarily infringed by their
80
+ Contribution(s) alone or by combination of their Contribution(s)
81
+ with the Work to which such Contribution(s) was submitted. If You
82
+ institute patent litigation against any entity (including a
83
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
84
+ or a Contribution incorporated within the Work constitutes direct
85
+ or contributory patent infringement, then any patent licenses
86
+ granted to You under this License for that Work shall terminate
87
+ as of the date such litigation is filed.
88
+
89
+ 4. Redistribution. You may reproduce and distribute copies of the
90
+ Work or Derivative Works thereof in any medium, with or without
91
+ modifications, and in Source or Object form, provided that You
92
+ meet the following conditions:
93
+
94
+ (a) You must give any other recipients of the Work or
95
+ Derivative Works a copy of this License; and
96
+
97
+ (b) You must cause any modified files to carry prominent notices
98
+ stating that You changed the files; and
99
+
100
+ (c) You must retain, in the Source form of any Derivative Works
101
+ that You distribute, all copyright, patent, trademark, and
102
+ attribution notices from the Source form of the Work,
103
+ excluding those notices that do not pertain to any part of
104
+ the Derivative Works; and
105
+
106
+ (d) If the Work includes a "NOTICE" text file as part of its
107
+ distribution, then any Derivative Works that You distribute must
108
+ include a readable copy of the attribution notices contained
109
+ within such NOTICE file, excluding those notices that do not
110
+ pertain to any part of the Derivative Works, in at least one
111
+ of the following places: within a NOTICE text file distributed
112
+ as part of the Derivative Works; within the Source form or
113
+ documentation, if provided along with the Derivative Works; or,
114
+ within a display generated by the Derivative Works, if and
115
+ wherever such third-party notices normally appear. The contents
116
+ of the NOTICE file are for informational purposes only and
117
+ do not modify the License. You may add Your own attribution
118
+ notices within Derivative Works that You distribute, alongside
119
+ or as an addendum to the NOTICE text from the Work, provided
120
+ that such additional attribution notices cannot be construed
121
+ as modifying the License.
122
+
123
+ You may add Your own copyright statement to Your modifications and
124
+ may provide additional or different license terms and conditions
125
+ for use, reproduction, or distribution of Your modifications, or
126
+ for any such Derivative Works as a whole, provided Your use,
127
+ reproduction, and distribution of the Work otherwise complies with
128
+ the conditions stated in this License.
129
+
130
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
131
+ any Contribution intentionally submitted for inclusion in the Work
132
+ by You to the Licensor shall be under the terms and conditions of
133
+ this License, without any additional terms or conditions.
134
+ Notwithstanding the above, nothing herein shall supersede or modify
135
+ the terms of any separate license agreement you may have executed
136
+ with Licensor regarding such Contributions.
137
+
138
+ 6. Trademarks. This License does not grant permission to use the trade
139
+ names, trademarks, service marks, or product names of the Licensor,
140
+ except as required for reasonable and customary use in describing the
141
+ origin of the Work and reproducing the content of the NOTICE file.
142
+
143
+ 7. Disclaimer of Warranty. Unless required by applicable law or
144
+ agreed to in writing, Licensor provides the Work (and each
145
+ Contributor provides its Contributions) on an "AS IS" BASIS,
146
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
147
+ implied, including, without limitation, any warranties or conditions
148
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
149
+ PARTICULAR PURPOSE. You are solely responsible for determining the
150
+ appropriateness of using or redistributing the Work and assume any
151
+ risks associated with Your exercise of permissions under this License.
152
+
153
+ 8. Limitation of Liability. In no event and under no legal theory,
154
+ whether in tort (including negligence), contract, or otherwise,
155
+ unless required by applicable law (such as deliberate and grossly
156
+ negligent acts) or agreed to in writing, shall any Contributor be
157
+ liable to You for damages, including any direct, indirect, special,
158
+ incidental, or consequential damages of any character arising as a
159
+ result of this License or out of the use or inability to use the
160
+ Work (including but not limited to damages for loss of goodwill,
161
+ work stoppage, computer failure or malfunction, or any and all
162
+ other commercial damages or losses), even if such Contributor
163
+ has been advised of the possibility of such damages.
164
+
165
+ 9. Accepting Warranty or Additional Liability. While redistributing
166
+ the Work or Derivative Works thereof, You may choose to offer,
167
+ and charge a fee for, acceptance of support, warranty, indemnity,
168
+ or other liability obligations and/or rights consistent with this
169
+ License. However, in accepting such obligations, You may act only
170
+ on Your own behalf and on Your sole responsibility, not on behalf
171
+ of any other Contributor, and only if You agree to indemnify,
172
+ defend, and hold each Contributor harmless for any liability
173
+ incurred by, or claims asserted against, such Contributor by reason
174
+ of your accepting any such warranty or additional liability.
175
+
176
+ END OF TERMS AND CONDITIONS
177
+
178
+ APPENDIX: How to apply the Apache License to your work.
179
+
180
+ To apply the Apache License to your work, attach the following
181
+ boilerplate notice, with the fields enclosed by brackets "[]"
182
+ replaced with your own identifying information. (Don't include
183
+ the brackets!) The text should be enclosed in the appropriate
184
+ comment syntax for the file format. We also recommend that a
185
+ file or class name and description of purpose be included on the
186
+ same "printed page" as the copyright notice for easier
187
+ identification within third-party archives.
188
+
189
+ Copyright [yyyy] [name of copyright owner]
190
+
191
+ Licensed under the Apache License, Version 2.0 (the "License");
192
+ you may not use this file except in compliance with the License.
193
+ You may obtain a copy of the License at
194
+
195
+ http://www.apache.org/licenses/LICENSE-2.0
196
+
197
+ Unless required by applicable law or agreed to in writing, software
198
+ distributed under the License is distributed on an "AS IS" BASIS,
199
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200
+ See the License for the specific language governing permissions and
201
+ limitations under the License.
README.md ADDED
@@ -0,0 +1,169 @@
+ <h1 align="center">Time-to-Move</h1>
+ <h2 align="center">Training-Free Motion-Controlled Video Generation via Dual-Clock Denoising</h2>
+ <p align="center">
+ <a href="https://www.linkedin.com/in/assaf-singer/">Assaf Singer</a><sup>†</sup> ·
+ <a href="https://rotsteinnoam.github.io/">Noam Rotstein</a><sup>†</sup> ·
+ <a href="https://www.linkedin.com/in/amir-mann-a890bb276/">Amir Mann</a> ·
+ <a href="https://ron.cs.technion.ac.il/">Ron Kimmel</a> ·
+ <a href="https://orlitany.github.io/">Or Litany</a>
+ </p>
+ <p align="center"><sup>†</sup> Equal contribution</p>
+
+ <p align="center">
+ <a href="https://time-to-move.github.io/">
+ <img src="assets/logo_page.svg" alt="Project Page" width="125">
+ </a>
+ <a href="https://arxiv.org/abs/2511.08633">
+ <img src="assets/logo_arxiv.svg" alt="Arxiv" width="125">
+ </a>
+ <a href="https://arxiv.org/pdf/2511.08633">
+ <img src="assets/logo_paper.svg" alt="Paper" width="125">
+ </a>
+ </p>
+
+
+
+ <div align="center">
+ <img src="assets/teaser.gif" width="900" /><br/>
+ <span style="color: inherit; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Helvetica, Arial, 'Noto Sans', sans-serif;">
+ <big><strong>Warped</strong></big>&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;
+ <big><strong>Ours</strong></big>&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;
+ <big><strong>Warped</strong></big>&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;
+ <big><strong>Ours</strong></big>
+ </span>
+ </div>
+
+ <br>
+
+ ## Table of Contents
+
+ - [Inference](#inference)
+   - [Dual Clock Denoising Guide](#dual-clock-denoising)
+   - [Wan](#wan)
+   - [CogVideoX](#cogvideox)
+   - [Stable Video Diffusion](#stable-video-diffusion)
+ - [Generate Your Own Cut-and-Drag Examples](#generate-your-own-cut-and-drag-examples)
+   - [GUI guide](GUIs/README.md)
+ - [TODO](#todo)
+ - [BibTeX](#bibtex)
+
+
+ ## Inference
+
+ **Time-to-Move (TTM)** is a plug-and-play technique that can be integrated into any image-to-video diffusion model.
+ We provide implementations for **Wan 2.2**, **CogVideoX**, and **Stable Video Diffusion (SVD)**.
+ As expected, the stronger the base model, the better the resulting videos.
+ Adapting TTM to new models and pipelines is straightforward and can typically be done in just a few hours.
+ We **recommend using Wan**, which generally produces higher‑quality results and adheres more faithfully to user‑provided motion signals.
+
+
+ For each model, you can use the [included examples](./examples/) or create your own as described in
+ [Generate Your Own Cut-and-Drag Examples](#generate-your-own-cut-and-drag-examples).
+
+ ### Dual Clock Denoising
+ TTM depends on two hyperparameters that set the noise depth at which denoising begins in different regions. In practice, we do not pass `tweak` and `tstrong` as raw timesteps. Instead, we pass `tweak-index` and `tstrong-index`, which indicate the iteration (out of the total `num_inference_steps`, 50 for all models) at which each denoising phase begins.
+ Constraints: `0 ≤ tweak-index ≤ tstrong-index ≤ num_inference_steps`.
+
+ * **tweak-index** — when the denoising process **outside the mask** begins.
+   - Too low: scene deformations, object duplication, or unintended camera motion.
+   - Too high: regions outside the mask look static (e.g., non-moving backgrounds).
+ * **tstrong-index** — when the denoising process **within the mask** begins. In our experience, the best value depends on mask size and quality.
+   - Too low: object may drift from the intended path.
+   - Too high: object may look rigid or over-constrained.
+
+
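As a rough illustration of how the two indices relate to noise depth, they can be mapped onto a uniform 50-step schedule. This is a sketch only: each pipeline uses its scheduler's own (generally non-uniform) timesteps, and `dual_clock_start_steps` is a hypothetical helper, not part of the codebase.

```python
import numpy as np

def dual_clock_start_steps(tweak_index: int, tstrong_index: int,
                           num_inference_steps: int = 50):
    """Return the (illustrative) timesteps at which denoising begins
    outside and inside the mask, on a uniform t=1000..0 schedule."""
    assert 0 <= tweak_index <= tstrong_index <= num_inference_steps
    timesteps = np.linspace(1000, 0, num_inference_steps, endpoint=False)
    timesteps = np.append(timesteps, 0.0)  # index == num_inference_steps -> t=0
    return int(timesteps[tweak_index]), int(timesteps[tstrong_index])

# With the cut-and-drag defaults, the masked region joins the denoising
# later, i.e., from a lower noise level, so it stays closer to the
# user-provided motion signal than the unmasked background does.
outside_t, inside_t = dual_clock_start_steps(3, 7)
```

A larger index means that region starts from a shallower noise level and is therefore constrained more tightly by its conditioning; this is why `tstrong-index` (inside the mask) is always at least `tweak-index`.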
+ ### Wan
+ To set up the environment for running Wan 2.2, follow the installation instructions in the official [Wan 2.2 repository](https://github.com/Wan-Video/Wan2.2). Our implementation builds on the [🤗 Diffusers Wan I2V pipeline](https://github.com/huggingface/diffusers/blob/345864eb852b528fd1f4b6ad087fa06e0470006b/src/diffusers/pipelines/wan/pipeline_wan_i2v.py),
+ which we adapt for TTM using the I2V 14B backbone.
+
+ #### Run inference (using the included Wan examples):
+ ```bash
+ python run_wan.py \
+   --input-path "./examples/cutdrag_wan_Monkey" \
+   --output-path "./outputs/wan_monkey.mp4" \
+   --tweak-index 3 \
+   --tstrong-index 7
+ ```
+
+ #### Good starting points:
+ * Cut-and-Drag: `tweak-index=3`, `tstrong-index=7`
+ * Camera control: `tweak-index=2`, `tstrong-index=5`
+
+ <br>
+
+ <details>
+ <summary><big><strong>CogVideoX</strong></big></summary><br>
+
+ To set up the environment for running CogVideoX, follow the installation instructions in the official [CogVideoX repository](https://github.com/zai-org/CogVideo).
+ Our implementation builds on the [🤗 Diffusers CogVideoX I2V pipeline](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/cogvideo/pipeline_cogvideox_image2video.py), which we adapt for Time-to-Move (TTM) using the CogVideoX-I2V 5B backbone.
+
+
+ #### Run inference (on the included 49-frame CogVideoX example):
+ ```bash
+ python run_cog.py \
+   --input-path "./examples/cutdrag_cog_Monkey" \
+   --output-path "./outputs/cog_monkey.mp4" \
+   --tweak-index 4 \
+   --tstrong-index 9
+ ```
+ </details>
+ <br>
+
+
+ <details>
+ <summary><big><strong>Stable Video Diffusion</strong></big></summary>
+ <br>
+
+ To set up the environment for running SVD, follow the installation instructions in the official [SVD repository](https://github.com/Stability-AI/generative-models).
+ Our implementation builds on the [🤗 Diffusers SVD I2V pipeline](https://github.com/huggingface/diffusers/blob/8abc7aeb715c0149ee0a9982b2d608ce97f55215/src/diffusers/pipelines/stable_video_diffusion/pipeline_stable_video_diffusion.py#L147), which we adapt for Time-to-Move (TTM).
+
+ #### Run inference (on the included 21-frame SVD example):
+ ```bash
+ python run_svd.py \
+   --input-path "./examples/cutdrag_svd_Fish" \
+   --output-path "./outputs/svd_fish.mp4" \
+   --tweak-index 16 \
+   --tstrong-index 21
+ ```
+ </details>
+ <br>
+
+ ## Generate Your Own Cut-and-Drag Examples
+ We provide an easy-to-use GUI for creating cut-and-drag examples that can later be used for video generation in **Time-to-Move**. We recommend reading the [GUI guide](GUIs/README.md) before using it.
+
+ <p align="center">
+ <img src="assets/gui.png" alt="Cut-and-Drag GUI Example" width="400">
+ </p>
+
+ To get started quickly, create a new environment and run:
+ ```bash
+ pip install PySide6 opencv-python numpy imageio imageio-ffmpeg
+ python GUIs/cut_and_drag.py
+ ```
+ <br>
+
+ ### TODO 🛠️
+
+ - [x] Wan 2.2 run code
+ - [x] CogVideoX run code
+ - [x] SVD run code
+ - [x] Cut-and-Drag examples
+ - [x] Camera-control examples
+ - [x] Cut-and-Drag GUI
+ - [x] Cut-and-Drag GUI guide
+ - [ ] Evaluation code
+
+
+ ## BibTeX
+ ```
+ @misc{singer2025timetomovetrainingfreemotioncontrolled,
+       title={Time-to-Move: Training-Free Motion Controlled Video Generation via Dual-Clock Denoising},
+       author={Assaf Singer and Noam Rotstein and Amir Mann and Ron Kimmel and Or Litany},
+       year={2025},
+       eprint={2511.08633},
+       archivePrefix={arXiv},
+       primaryClass={cs.CV},
+       url={https://arxiv.org/abs/2511.08633},
+ }
+ ```
assets/gui.png ADDED

Git LFS Details

  • SHA256: 6dfe3d202f383a64ff9c8756868f4f6a6d724b5a7a8ff0e5c93c8c4f546d3e43
  • Pointer size: 131 Bytes
  • Size of remote file: 224 kB
assets/logo_arxiv.svg ADDED
assets/logo_page.svg ADDED
assets/logo_paper.svg ADDED
assets/teaser.gif ADDED

Git LFS Details

  • SHA256: 7f96d0a7dfe267413e93e45f6c5f5fffb3329a81e3c95934674d5e8b9928e9b5
  • Pointer size: 132 Bytes
  • Size of remote file: 3.14 MB
docs/HUGGINGFACE.md ADDED
@@ -0,0 +1,137 @@
+ # Hosting Time-to-Move on Hugging Face
+
+ This guide explains how to mirror the Time-to-Move (TTM) codebase on the 🤗 Hub and how to expose an interactive demo through a Space. It assumes you already read the main `README.md`, understand how to run `run_wan.py`, and have access to Wan 2.2 weights through your Hugging Face account.
+
+ ---
+
+ ## 1. Prerequisites
+
+ - Hugging Face account with access to the Wan 2.2 Image-to-Video model (`Wan-AI/Wan2.2-I2V-A14B-Diffusers` at the time of writing).
+ - Local environment with Git, Git LFS, Python 3.10+, and the `huggingface_hub` CLI.
+ - GPU-backed hardware both locally (for testing) and on Spaces (A100 or A10 is strongly recommended; CPU-only tiers are too slow for Wan 2.2).
+ - Optional: organization namespace on the Hugging Face Hub (recommended if you want to publish under a team/org).
+
+ Authenticate once locally (this stores a token in `~/.huggingface`):
+
+ ```bash
+ huggingface-cli login
+ git lfs install
+ ```
+
+ ---
+
+ ## 2. Publish the code as a model repository
+
+ 1. **Create an empty repo on the Hub.** Example:
+
+    ```bash
+    huggingface-cli repo create time-to-move/wan-ttm --type=model --yes
+    git clone https://huggingface.co/time-to-move/wan-ttm
+    cd wan-ttm
+    ```
+
+ 2. **Copy the TTM sources.** From the project root, copy the files that users need to reproduce inference:
+
+    ```bash
+    rsync -av \
+      --exclude ".git/" \
+      --exclude "outputs/" \
+      /path/to/TTM/ \
+      /path/to/wan-ttm/
+    ```
+
+    Make sure `pipelines/`, `run_wan.py`, `run_cog.py`, `run_svd.py`, `examples/`, and the new `huggingface_space/` folder are included. Track large binary assets:
+
+    ```bash
+    git lfs track "*.mp4" "*.png" "*.gif"
+    git add .gitattributes
+    ```
+
+ 3. **Add a model card.** Reuse the main `README.md` or create a shorter version describing:
+    - What Time-to-Move does.
+    - How to run `run_wan.py` with the `motion_signal` + `mask`.
+    - Which base model checkpoint the repo expects (Wan 2.2 I2V A14B).
+
+ 4. **Push to the Hub.**
+
+    ```bash
+    git add .
+    git commit -m "Initial commit of Time-to-Move Wan implementation"
+    git push
+    ```
+
+ Users can now do:
+
+ ```python
+ from huggingface_hub import snapshot_download
+ snapshot_download("time-to-move/wan-ttm")
+ ```
+
+ ---
+
+ ## 3. Prepare a Hugging Face Space (Gradio)
+
+ This repository now contains `huggingface_space/`, a ready-to-use Space template:
+
+ ```
+ huggingface_space/
+ ├── README.md          # Quickstart instructions
+ ├── app.py             # Gradio UI (loads Wan + Time-to-Move logic)
+ └── requirements.txt   # Runtime dependencies
+ ```
+
+ ### 3.1 Create the Space
+
+ ```bash
+ huggingface-cli repo create time-to-move/wan-ttm-demo --type=space --sdk=gradio --yes
+ git clone https://huggingface.co/spaces/time-to-move/wan-ttm-demo
+ cd wan-ttm-demo
+ ```
+
+ Copy everything from `huggingface_space/` into the Space repository (or keep the whole repo and set the Space’s working directory accordingly). Commit and push.
+
+ ### 3.2 Configure hardware and secrets
+
+ - **Hardware:** Select an A100 (preferred) or A10 GPU runtime in the Space settings. Wan 2.2 is too heavy for CPUs.
+ - **WAN_MODEL_ID:** If you mirrored Wan 2.2 into your organization, set the environment variable to point to it. Otherwise leave the default (`Wan-AI/Wan2.2-I2V-A14B-Diffusers`).
+ - **HF_TOKEN / WAN_ACCESS_TOKEN:** Add a Space secret only if the Wan checkpoint is private. The Gradio app reads from `HF_TOKEN` automatically when calling `from_pretrained`.
+ - **PYTORCH_CUDA_ALLOC_CONF:** Recommended value `expandable_segments:True` to reduce CUDA fragmentation.
+
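A minimal sketch of how a Space's `app.py` might resolve these settings at startup (the helper name `resolve_space_config` is hypothetical; the variable names and defaults follow the list above):

```python
import os

def resolve_space_config(env):
    """Collect the Space's runtime configuration from environment variables."""
    return {
        # Fall back to the public Wan 2.2 checkpoint when no mirror is set.
        "model_id": env.get("WAN_MODEL_ID", "Wan-AI/Wan2.2-I2V-A14B-Diffusers"),
        # Either secret works; None means the checkpoint is assumed public.
        "token": env.get("HF_TOKEN") or env.get("WAN_ACCESS_TOKEN"),
        "alloc_conf": env.get("PYTORCH_CUDA_ALLOC_CONF", "expandable_segments:True"),
    }

cfg = resolve_space_config(os.environ)
```

Resolving everything through `os.environ` keeps the Space configurable from its settings page alone, with no code changes needed to swap in a private mirror.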
+ ### 3.3 How the app works
+
+ `huggingface_space/app.py` exposes:
+
+ - A dropdown of the pre-packaged `examples/cutdrag_wan_*` prompts.
+ - Optional custom uploads (`first_frame`, `mask.mp4`, `motion_signal.mp4`) following the README workflow.
+ - Sliders for `tweak-index`, `tstrong-index`, guidance scale, seed, etc.
+ - Live status messages and a generated MP4 preview using `diffusers.utils.export_to_video`.
+
+ The UI lazily loads the `WanImageToVideoTTMPipeline` with tiling/slicing enabled to reduce VRAM usage. All preprocessing matches the logic in `run_wan.py` (the same `compute_hw_from_area` helper is reused).
+
+ If you need to customize the experience (e.g., restrict to certain prompts, enforce shorter sequences), edit `huggingface_space/app.py` before pushing.
+
+ ---
+
+ ## 4. Testing checklist
+
+ 1. **Local dry-run.**
+    ```bash
+    pip install -r huggingface_space/requirements.txt
+    WAN_MODEL_ID=Wan-AI/Wan2.2-I2V-A14B-Diffusers \
+    python huggingface_space/app.py
+    ```
+    Ensure you can generate at least one of the bundled examples.
+
+ 2. **Space smoke test.**
+    - Open the deployed Space.
+    - Run the default example (`cutdrag_wan_Monkey`) and confirm you receive a video in ~2–3 minutes on A100 hardware.
+    - Optionally upload a small custom mask/video pair and verify that `tweak-index`/`tstrong-index` are honored.
+
+ 3. **Monitor logs.** Use the Space “Logs” tab to confirm:
+    - The pipeline downloads from the expected `WAN_MODEL_ID`.
+    - VRAM usage stays within the selected hardware tier.
+
+ 4. **Freeze dependencies.** When satisfied, tag the Space (`v1`, `demo`) so users know which TTM commit it matches.
+
+ You now have both a **model repository** (for anyone to clone/run) and a **public Space** for live demos. Feel free to adapt the instructions for the CogVideoX or Stable Video Diffusion pipelines if you plan to expose them as well; start by duplicating the provided Space template and swapping out `run_wan.py` for the relevant runner.
+
examples/camcontrol_Bridge/first_frame.png ADDED

Git LFS Details

  • SHA256: 6ee141276b8b202798b6bc727ca46a8f4b6202739464121108dc1304c3e40c10
  • Pointer size: 131 Bytes
  • Size of remote file: 706 kB
examples/camcontrol_Bridge/mask.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5f4670ba935c0d5dd9e5f1b3632c6b1c8ca45a2e2af80db623e953171c037c77
+ size 453479
examples/camcontrol_Bridge/motion_signal.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0c92b189953a948a73e0803a53a6e433e752de3bf0b57afd894a9329144310ce
+ size 1725802
examples/camcontrol_Bridge/prompt.txt ADDED
@@ -0,0 +1 @@
+ A stone bridge arches over a narrow river winding through a steep canyon. The camera flies low along the river, gliding beneath the bridge, where water sparkles and echoes against the rock walls. Slowly, the view rises upward, revealing vast green forests blanketing the surrounding hills and valleys. Warm sunlight filters through the trees, highlighting the lush textures of the landscape and adding depth to the sweeping forest view.
examples/camcontrol_ConcertCrowd/first_frame.png ADDED

Git LFS Details

  • SHA256: cf02fc1b2f175c09d0e2f2af93089dc55dfbb20ea7c7c48971821912b431e181
  • Pointer size: 131 Bytes
  • Size of remote file: 696 kB
examples/camcontrol_ConcertCrowd/mask.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:29652ba7927e0dea5744dc107b334d63cab72e0ebeb502ecb4bdcbf07d7cfcc4
+ size 989549
examples/camcontrol_ConcertCrowd/motion_signal.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4dafbcc4a0cdbceef618d8d179004f2093a68db370cc9a28fdbcc970def9b26a
+ size 4568859
examples/camcontrol_ConcertCrowd/prompt.txt ADDED
@@ -0,0 +1 @@
+ A massive crowd fills the arena, their energy palpable as they sway and cheer under the dazzling lights. Confetti flutters down from above, adding to the electric atmosphere. The audience, a sea of raised hands and excited faces, pulses with anticipation as the music builds. Security personnel stand at the forefront, ensuring safety while the crowd's enthusiasm grows. The stage is set for an unforgettable performance, with the audience fully immersed in the moment, ready to sing along and dance to the rhythm of the night.
examples/camcontrol_ConcertStage/first_frame.png ADDED

Git LFS Details

  • SHA256: f370134f1d4fec17ebd90ce1ba970c373127ac38a47ede8741e9c6edd00c4540
  • Pointer size: 131 Bytes
  • Size of remote file: 559 kB
examples/camcontrol_ConcertStage/mask.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:31c2037732bacd73d4d8d723270e042c1898eb4a5021bb9e16f8b08ff30a5686
+ size 845639
examples/camcontrol_ConcertStage/motion_signal.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2b6b5b4e3aa1e4b3f719a800a1bdc7ab2f4aab18943011d06b0264ed44bf4504
+ size 2643188
examples/camcontrol_ConcertStage/prompt.txt ADDED
@@ -0,0 +1 @@
+ The concert arena is electrified with energy as a band performs on stage, bathed in vibrant blue and purple lights. Flames shoot up dramatically on either side, adding to the intensity of the performance. The crowd is a sea of raised hands, swaying and cheering in unison, completely immersed in the music. In the foreground, a fan with an ecstatic expression captures the moment on their phone, while others around them shout and sing along. The atmosphere is charged with excitement and the pulsating rhythm of the music reverberates through the venue.
examples/camcontrol_RiverOcean/first_frame.png ADDED

Git LFS Details

  • SHA256: a8a931afdfebcf65dcf04a098146fb8922544eaea0fbea725bf8ee324efa92b5
  • Pointer size: 131 Bytes
  • Size of remote file: 669 kB
examples/camcontrol_RiverOcean/mask.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:993427e110aaf4ef4b319f7d35c27ccbab2ec42a90916ecaab508179cac3d426
+ size 528205
examples/camcontrol_RiverOcean/motion_signal.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:189edb6a3e5d5491c4849e12b1acaf994a9978b8004dc6cc5f9449b34e28752b
+ size 2424199
examples/camcontrol_RiverOcean/prompt.txt ADDED
@@ -0,0 +1 @@
+ A serene river winds its way through lush greenery, bordered by dense forests and sandy shores, before gracefully merging with the vast ocean. The gentle waves of the ocean lap against the rocky cliffs, creating a harmonious blend of land and sea. The sun casts a warm glow over the landscape, highlighting the vibrant colors of the foliage and the shimmering water. As the scene unfolds, birds soar above, and the gentle breeze rustles the leaves, adding a sense of tranquility to this picturesque meeting of river and ocean.
examples/camcontrol_SpiderMan/first_frame.png ADDED

Git LFS Details

  • SHA256: b5d3491afe98d04b32fba4b9b95301ebb3f787a96c0125f09c7e4ef14084d9e0
  • Pointer size: 131 Bytes
  • Size of remote file: 551 kB
examples/camcontrol_SpiderMan/mask.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c65b3cd42cbb622b702ab740b35546f342b7a9e4bf611b76ad9399f21b45f1a0
+ size 873522
examples/camcontrol_SpiderMan/motion_signal.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d2c98573f7a9c8b5716dcfc19b51a0193dae0d8aa28dfed23f2b62440321f90f
+ size 1821884
examples/camcontrol_SpiderMan/prompt.txt ADDED
A superhero in a red and blue suit swings gracefully through the towering skyscrapers of a bustling city. The sun sets in the distance, casting a warm golden glow over the urban landscape. As he releases his web, he flips through the air with agility, preparing to latch onto another building. The streets below are filled with the hustle and bustle of city life, while the hero moves effortlessly above, embodying a sense of freedom and adventure.
examples/camcontrol_Volcano/first_frame.png ADDED

Git LFS Details

  • SHA256: d668c88c9c081324cdcb5aeb74f9633759ef44ced0d80e587c033420ef61c5da
  • Pointer size: 131 Bytes
  • Size of remote file: 550 kB
examples/camcontrol_Volcano/mask.mp4 ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:0cb0974c3b9c055dfdded4b1081c65c068af133755d3dc0ce763e088fcbfebd8
size 612213
examples/camcontrol_Volcano/motion_signal.mp4 ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:418fe3ec7394684193be49b6ecff9f5e4228eb667eb5ffa5a0050d8e260fd95a
size 1510493
examples/camcontrol_Volcano/prompt.txt ADDED
Under a starry night sky illuminated by the ethereal glow of the aurora borealis, a volcano erupts with fierce intensity. Molten lava spews high into the air, casting a fiery orange glow against the darkened landscape. The lava cascades down the sides of the volcano, forming glowing rivers that snake across the rugged terrain. Ash and smoke billow upwards, mingling with the vibrant colors of the northern lights, creating a dramatic and mesmerizing spectacle. The scene is both awe-inspiring and formidable, capturing the raw power and beauty of nature in its most elemental form.
examples/camcontrol_VolcanoTitan/first_frame.png ADDED

Git LFS Details

  • SHA256: 19f3f77f46324d180fa7c75466b45ff19a30261d01bf8bcb1ba4eb60c888a0f1
  • Pointer size: 131 Bytes
  • Size of remote file: 559 kB
examples/camcontrol_VolcanoTitan/mask.mp4 ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:7f1b619fff98a78b5e39dd590727540514d93fcafbe042933086c969b0791139
size 393530
examples/camcontrol_VolcanoTitan/motion_signal.mp4 ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:6a2b4efbf710f71124bf2033638097150b573e10511f1b714428b597aa74d9ed
size 2147868
examples/camcontrol_VolcanoTitan/prompt.txt ADDED
A colossal titan emerges from the molten heart of an erupting volcano, its body composed of dark, jagged rock with glowing veins of fiery lava. Lightning crackles across the stormy sky, illuminating the titan's menacing form as it rises with a slow, deliberate motion. Rivers of lava cascade down the volcano's slopes, carving fiery paths through the landscape. The titan's eyes burn with an intense, molten glow, and its massive hands grip the edges of the lava pool, sending tremors through the ground. As the eruption intensifies, the titan lets out a thunderous roar, echoing across the volcanic terrain.
examples/cutdrag_cog_Monkey/first_frame.png ADDED

Git LFS Details

  • SHA256: 0f445e634e3a9fd5145b82535c97b06886e327f2e46bbfbc87459829f40839aa
  • Pointer size: 131 Bytes
  • Size of remote file: 494 kB
examples/cutdrag_cog_Monkey/mask.mp4 ADDED
Binary file (33.3 kB)
examples/cutdrag_cog_Monkey/motion_signal.mp4 ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:79eaff99c0e0322e7623029bf2211bff3ad4bbb53fdea511fc6dfa62c116514b
size 299834
examples/cutdrag_cog_Monkey/prompt.txt ADDED
A lively monkey energetically bounces on a neatly made bed, its limbs splayed in mid-air. As the monkey lands, the bed creases slightly under its weight, and it quickly prepares for another joyful leap, its eyes wide with excitement and mischief.
examples/cutdrag_svd_Fish/first_frame.png ADDED

Git LFS Details

  • SHA256: 67d7d2e837685604be229a94f65d087f714436fcef2f74360ffe3e07940d69a5
  • Pointer size: 131 Bytes
  • Size of remote file: 520 kB
examples/cutdrag_svd_Fish/mask.mp4 ADDED
Binary file (8.91 kB)
examples/cutdrag_svd_Fish/motion_signal.mp4 ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:92c6343da8c5028390c522aeac9a5f3a258eaf09660f74f7c977c15e40aa3485
size 174696
examples/cutdrag_wan_Birds/first_frame.png ADDED

Git LFS Details

  • SHA256: 505a7edb31d6df4a3f4fcaeeb519dd97f11fec6c6e1d74a4da6fc43e2d7f5837
  • Pointer size: 131 Bytes
  • Size of remote file: 271 kB
examples/cutdrag_wan_Birds/mask.mp4 ADDED
Binary file (99.6 kB)
examples/cutdrag_wan_Birds/motion_signal.mp4 ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:fae72bb1f6c2a284727ef267ce910b7c6b36492b69f1fe5c5483ae2a6a60d10a
size 482308
examples/cutdrag_wan_Birds/prompt.txt ADDED
As the sun sets, casting a warm glow across the sky, three birds glide gracefully through the air. A majestic eagle leads the way, its powerful wings outstretched, catching the last rays of sunlight. Beside it, a swift falcon darts with precision, its sleek form cutting through the gentle breeze. Below them, a swallow flits playfully, its agile movements creating a dance against the backdrop of rolling hills and silhouetted trees. The scene is serene, with the birds moving in harmony, painting a picture of freedom and grace against the vibrant hues of the evening sky.