
Show HN: AI prompt-to-storyboard videos w/ GPT, Coqui voices, StabilityAI images https://ift.tt/LYxotTU

I had 2 weeks off from work and wanted a pet project before heading back. With GPT and generative AI in the news, I decided to chain multiple AI products together to build something really cool. I set my end goal to be prompt-to-storyboard (aka fun videos generated purely via generative AI). There are some prompt-to-video products, but I wanted to tell stories with audio as well.

The end product takes an initial prompt and produces a series of images and audio files, which I then combine (with subtitles) into the final video. To showcase videos, there is a basic upvote/downvote leaderboard.

Text | OpenAI https://openai.com/

Text is generated in a few high-level steps that I ask GPT to work through. These are all based on the initial user prompt, and as such are (ideally) indirectly controlled by the user.

- Create a concept for a movie scene based on the prompt, including the theme and setting
- Define each character in the scene
- Define how each character looks
- Define how each character sounds
- Define 'frames' of the storyboard

All of this textual information is defined in a JSON object I describe to GPT. I then take GPT's output and build the storyboard with the tools below.

Voices | Coqui https://coqui.ai/

From the GPT output, I needed three major pieces of information to build voices in a way that I found satisfying:

- Description of the voice
- Description of the performance
- Text of the actual dialog spoken

Coqui has a product called 'prompt-to-voice', where you can describe how a character will sound and a custom voice is made for that character. This is the form in which GPT defines the characters used in the storyboard, so every voice is unique per storyboard. For example, GPT will decide that a certain character is an "older man with a raspy voice", and I'll ask Coqui for that type of voice.

In addition, to describe the performance, GPT outputs a basic emotion summarizing each line of dialog (happy, sad, angry, etc.). This is also sent to Coqui for each audio clip generated.

Images | Stability AI https://stability.ai/

I originally set up the storyboard generator to use DALL-E, since I was already integrating with OpenAI for GPT, but I found the cost prohibitive. As such, the images generated for the storyboards are from Stability AI's Stable Diffusion (stable-diffusion-512-v2-1). To generate each frame, I combine the description of the frame that GPT provides with the theme and setting that GPT output for the whole storyboard. Since GPT controls all of the data sent to Stable Diffusion, if your prompt dictates a theme it should hopefully translate into a theme in your storyboard.

Both the storyboard and the 'prompt enhanced' image generation in the 'Create Content' tab pre-feed a GPT request with a summary of Stability AI's prompt guide. GPT will try to pick keyword weights to improve the image, and much like the setting and theme, keywords should be influenced by the initial prompt provided to the product.

Conclusion: Have fun and make my 2 weeks of work seem worth it! Voting on storyboards and creating storyboards both require a simple Google login to get access.

https://meyer.id

April 21, 2023 at 06:05AM
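To make the data flow described in the post concrete, here is a minimal Python sketch of how a storyboard JSON object could be parsed and turned into per-frame image prompts and voice requests. The post does not publish its JSON schema, so every field name here (`theme`, `setting`, `appearance`, `voice`, `emotion`, `dialog`, and so on), the helper functions, and the sample data are assumptions made for illustration. The returned dictionaries are not the actual request formats of Coqui or Stability AI, and no network calls are made.

```python
import json
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Character:
    name: str
    appearance: str  # how the character looks
    voice: str       # how the character sounds


@dataclass
class Frame:
    description: str  # what the image for this frame should show
    speaker: str      # name of the character speaking in this frame
    emotion: str      # basic emotion for the line (happy, sad, angry, ...)
    dialog: str       # text of the dialog actually spoken


@dataclass
class Storyboard:
    theme: str
    setting: str
    characters: Dict[str, Character]
    frames: List[Frame]


def parse_storyboard(raw: str) -> Storyboard:
    """Parse the (hypothetical) JSON object that GPT was asked to produce."""
    data = json.loads(raw)
    characters = {c["name"]: Character(**c) for c in data["characters"]}
    frames = [Frame(**f) for f in data["frames"]]
    return Storyboard(data["theme"], data["setting"], characters, frames)


def image_prompt(board: Storyboard, frame: Frame) -> str:
    """Combine the frame description with the storyboard-wide theme and setting."""
    return f"{frame.description}, {board.theme}, {board.setting}"


def voice_request(board: Storyboard, frame: Frame) -> dict:
    """Collect the three pieces the post lists: voice, performance, and text."""
    return {
        "voice_description": board.characters[frame.speaker].voice,
        "emotion": frame.emotion,
        "text": frame.dialog,
    }


# Illustrative sample only; not real output from the product.
SAMPLE = """
{
  "theme": "film noir",
  "setting": "rainy city at night",
  "characters": [
    {"name": "Sam", "appearance": "grey trench coat",
     "voice": "older man with a raspy voice"}
  ],
  "frames": [
    {"description": "Sam waits under a streetlamp", "speaker": "Sam",
     "emotion": "sad", "dialog": "It always rains here."}
  ]
}
"""

board = parse_storyboard(SAMPLE)
for frame in board.frames:
    print(image_prompt(board, frame))
    print(voice_request(board, frame))
```

Keeping the theme and setting at the storyboard level and appending them to every frame description is one simple way to get the effect the post describes, where a theme in the user's prompt carries through to every generated image.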

