As the five-day power struggle at OpenAI reaches its conclusion with Sam Altman’s reinstatement, Adobe is gearing up to enhance its generative AI capabilities. According to a report from the Economic Times, the software giant has internally announced its acquisition of Rephrase, a California-based company specializing in text-to-video technology.
While the exact financial details of the deal remain undisclosed, this move is poised to strengthen Adobe’s suite of Creative Cloud products, which have steadily incorporated generative AI improvements over the past year. In particular, Rephrase will enable Adobe to empower its customers to effortlessly produce professional-quality videos from text inputs.
CEO Ashray Malhotra of Rephrase disclosed the acquisition through a LinkedIn post but refrained from explicitly naming Adobe, referring to the acquiring entity as a “leading tech giant.” When pressed for further details, he cited limitations on sharing information at this stage.
What Rephrase brings to the table: Established in 2019 by Ashray Malhotra, Nisheeth Lahoti, and Shivam Mangla, Rephrase offers enterprises access to Rephrase Studio, a platform enabling users to create polished videos featuring digital avatars in mere minutes. The process involves selecting a video template, choosing an avatar along with the desired voice, and adding the necessary content.
Upon initiating the rendering process within Rephrase, the platform automatically combines all elements, synchronizing the script with the chosen avatar. Users can enhance their content’s naturalness through various customization options, such as resizing avatars, altering backgrounds, adjusting pauses between words, or incorporating custom audio.
Over the past four years, Rephrase has amassed over 50,000 customers and secured nearly $14 million in funding from multiple investors, including Red Ventures and Lightspeed India. Initially known for enabling enterprises and influencers to create custom avatars for personalized business videos, the acquisition will now bring these capabilities, along with a significant portion of the Rephrase team, into Adobe’s fold, bolstering their generative AI video offerings.
Ashley Still, Senior Vice President and General Manager for Adobe Creative Cloud, wrote in the internal memo, “The Rephrase.ai team’s expertise in generative AI video and audio technology, and experience-building text-to-video generator tools, will extend our generative video capabilities—and enable us to deliver more value to our customers faster— all within our industry-leading creative applications.”
When VentureBeat reached out to Adobe for comment, a spokesperson declined to provide additional insights into the development or how Rephrase’s tools will complement Adobe’s product portfolio.
Adobe’s Strong Embrace of AI: In recent months, Adobe has been at the forefront of advancing generative AI with several product updates. It introduced Firefly, an AI engine for image generation, which was integrated across Creative Cloud products like Photoshop. This innovation allowed users to manipulate images by describing changes in plain text.
Furthermore, at its annual Max conference last month, Adobe showcased various experimental generative AI-powered video features, including upscaling videos, changing textures and objects through text prompts, and compositing subjects and scenes from separate videos. While the timeline for the incorporation of these features into future releases remains uncertain, Rephrase’s digital avatar-based capabilities appear to be a promising addition.
Ashray Malhotra expressed his excitement for the future of Generative AI, emphasizing that it’s still in its early stages. Adobe Creative Cloud, known for decades as the dominant platform for digital art and media, currently offers six main products for audio and video-related work: Premiere Pro, After Effects, Audition, Character Animator, Animate, and Media Encoder. These tools are used by both professionals and amateurs to create, edit, and share digital content, leaving a lasting impact on online communities and trends through countless memes, parodies, and viral art.
In the midst of the ongoing power struggle and mass resignations at OpenAI, Microsoft, a long-standing supporter of AI research, continues to advance its AI initiatives without slowing down. Today, the research division of Microsoft, led by Satya Nadella, unveiled Orca 2, a pair of compact language models that demonstrate remarkable performance, surpassing much larger language models, including Meta’s Llama-2 Chat-70B, in complex reasoning tasks conducted under zero-shot conditions.
These models come in two sizes, with 7 billion and 13 billion parameters, building upon the groundwork laid by the original 13B Orca model, which had already showcased strong reasoning capabilities by emulating the step-by-step reasoning processes of more powerful models a few months ago.
In a joint blog post, Microsoft researchers stated, “With Orca 2, we continue to demonstrate that improved training techniques and signals can empower smaller language models to achieve enhanced reasoning capabilities, typically associated with much larger models.”
Microsoft has made both of these new models open-source to encourage further research into the development and evaluation of smaller models that can match the performance of their larger counterparts. This initiative provides enterprises, especially those with limited resources, a more cost-effective option to address their specific use cases without the need for extensive computing resources.
While large language models like GPT-4 have long impressed both enterprises and individuals with their reasoning abilities and complex question answering, smaller models have often fallen short in this regard. Microsoft Research aimed to bridge this gap by fine-tuning Llama 2 base models using a highly customized synthetic dataset.
Instead of simply replicating the behavior of more capable models through imitation learning, as is commonly done, the researchers trained these models to employ various solution strategies tailored to specific tasks. The rationale behind this approach was that strategies designed for larger models might not always work optimally for smaller ones. For instance, while GPT-4 can directly answer complex questions, a smaller model may benefit from breaking down the same task into multiple steps.
“In Orca 2, we teach the model various reasoning techniques (step-by-step, recall then generate, recall-reason-generate, direct answer, etc.). More crucially, we aim to help the model learn to determine the most effective solution strategy for each task,” the researchers explained in a recently published paper. The training data for this project was obtained from a more capable teacher model in a way that guided the student model on when and how to use reasoning strategies for specific tasks.
When evaluated on 15 diverse benchmarks under zero-shot conditions, covering aspects like language comprehension, common-sense reasoning, multi-step reasoning, math problem solving, reading comprehension, summarization, and truthfulness, Orca 2 models delivered impressive results, often matching or surpassing models five to ten times their size.
The overall average of benchmark results indicated that Orca 2 7B and 13B outperformed Llama-2-Chat-13B and 70B, as well as WizardLM-13B and 70B. Only in the GSM8K benchmark, which included 8.5K high-quality grade school math problems, did WizardLM-70B perform notably better than the Orca and Llama models.
It’s worth noting that, despite their outstanding performance, these models may still inherit certain limitations common to other language models and the base model upon which they were fine-tuned.
Microsoft also highlighted that the techniques used to create the Orca models can potentially be applied to other base models in the field.
Future Prospects: Despite some limitations, Microsoft sees great potential for future advancements in areas such as improved reasoning, specialization, control, and safety of smaller language models. Leveraging carefully filtered synthetic data for post-training emerges as a key strategy for these improvements. As larger models continue to excel, the work on Orca 2 represents a significant step in diversifying the applications and deployment options of language models, according to the research team.
With the release of the open-source Orca 2 models and ongoing research in this space, it is likely that we will see more high-performing, compact language models emerge in the near future. Recent developments in the AI community, such as the release of a 34-billion parameter model by China’s 01.AI and Mistral AI’s 7 billion parameter model, demonstrate the growing interest in smaller, yet highly capable language models that can rival their larger counterparts.
Nvidia is strengthening its collaborative approach with Microsoft. During the Ignite conference, hosted by the Satya Nadella-led tech giant, Nvidia unveiled an AI foundry service aimed at assisting both enterprises and startups in building custom AI applications on the Azure cloud. These applications can leverage enterprise data through retrieval augmented generation (RAG) technology.
Jensen Huang, Nvidia’s founder and CEO, highlighted, “Nvidia’s AI foundry service combines our generative AI model technologies, LLM training expertise, and a massive AI factory. We built this service on Microsoft Azure, enabling enterprises worldwide to seamlessly integrate their custom models with Microsoft’s top-tier cloud services.”
In addition to this, Nvidia also introduced new 8-billion parameter models, which are part of the foundry service. They also announced their plan to incorporate their next-gen GPU into Microsoft Azure in the coming months.
So, how will the AI foundry service benefit Azure users? With Nvidia’s AI foundry service on Azure, cloud-based enterprises will gain access to all the essential components needed to create custom, business-focused generative AI applications in one place. This comprehensive offering includes Nvidia’s AI foundation models, the NeMo framework, and the Nvidia DGX cloud supercomputing service.
Manuvir Das, the VP of enterprise computing at Nvidia, emphasized, “For the first time, this entire process, from hardware to software, is available end to end on Microsoft Azure. Any customer can come and execute the entire enterprise generative AI workflow with Nvidia on Azure. They can procure the necessary technology components right within Azure. Simply put, it’s a collaborative effort between Nvidia and Microsoft.”
To provide enterprises with a wide range of foundation models for use with the foundry service in Azure environments, Nvidia is introducing a new family of Nemotron-3 8B models. These models support the creation of advanced enterprise chat and Q&A applications for sectors like healthcare, telecommunications, and financial services. They come with multilingual capabilities and will be accessible through the Azure AI model catalog, Hugging Face, and the Nvidia NGC catalog.
Among the other foundation models available in the Nvidia catalog are Llama 2 (also coming to the Azure AI catalog), Stable Diffusion XL, and Mistral 7b.
Once users have chosen their preferred model, they can move on to the training and deployment stage for custom applications using Nvidia DGX Cloud and AI Enterprise software, both of which are available through the Azure marketplace. DGX Cloud provides customers with scalable instances and includes the AI Enterprise toolkit, featuring the NeMo framework and Nvidia Triton Inference Server, enhancing Azure’s enterprise-grade AI service for faster LLM customization.
Nvidia noted that this toolkit is also available as a separate product on the marketplace, allowing users to utilize their existing Microsoft Azure Consumption Commitment credits to expedite model development.
Notably, Nvidia recently announced a similar partnership with Oracle, offering eligible enterprises the option to purchase these tools directly from the Oracle Cloud marketplace for training models and deployment on the Oracle Cloud Infrastructure (OCI).
Currently, early users of the foundry service on Azure include major software companies like SAP, Amdocs, and Getty Images. They are testing and building custom AI applications targeting various use cases.
Beyond the generative AI service, Microsoft and Nvidia have expanded their partnership to include the chipmaker’s latest hardware offerings. Microsoft unveiled new NC H100 v5 virtual machines for Azure, the first cloud instances in the industry featuring a pair of PCIe-based H100 GPUs connected via Nvidia NVLink. These machines provide nearly four petaflops of AI compute power and 188GB of faster HBM3 memory.
The Nvidia H100 NVL GPU offers up to 12 times higher performance on GPT-3 175B compared to the previous generation, making it suitable for inference and mainstream training workloads.
Furthermore, Microsoft plans to add the new Nvidia H200 Tensor Core GPU to its Azure fleet in the upcoming year. This GPU offers 141GB of HBM3e memory (1.8 times more than its predecessor) and 4.8 TB/s of peak memory bandwidth (a 1.4 times increase). It is designed for handling large AI workloads, including generative AI training and inference, and provides Azure users with multiple options for AI workloads alongside Microsoft’s new Maia 100 AI accelerator.
To accelerate LLM work on Windows devices, Nvidia announced several updates, including an update for TensorRT LLM for Windows. This update introduces support for new large language models like Mistral 7B and Nemotron-3 8B and delivers five times faster inference performance. These improvements make running these models smoother on desktops and laptops equipped with GeForce RTX 30 Series and 40 Series GPUs with at least 8GB of RAM.
Nvidia also mentioned that TensorRT-LLM for Windows will be compatible with OpenAI’s Chat API through a new wrapper, enabling hundreds of developer projects and applications to run locally on a Windows 11 PC with RTX, rather than relying on cloud-based infrastructure.
Interplay, a venture capital firm headquartered in New York, has successfully completed its third funding round, amassing $45 million. This latest fund follows two earlier rounds focusing on early-stage investments, particularly in software sectors like B2B marketplaces and specialized vertical software. We previously reported on Interplay in 2022 during its separate funding initiative.
Mark Peter Davis, the founder and managing partner at Interplay, highlighted in a conversation with TechCrunch the firm’s interest in companies revolutionizing previously un-digitized markets due to unfavorable economics. According to Davis, the recent trend is a move towards specialized services. Newer companies are challenging established broad-spectrum platforms by offering services more finely tuned to specific industries. This strategy has proven successful for Interplay, shaping the investment philosophy of their current fund.
Interplay’s initial fund operated on a small scale, akin to angel investing. However, the second fund marked a shift with external limited partners’ involvement. The third fund, distinct in its approach, attracts institutional investors, including funds of funds, family offices, and founders from Interplay’s own portfolio.
Davis outlines Interplay’s distinct qualities. Firstly, the consistency in their team of general partners, including Davis himself, Kevin Tung, and Mike Rogers, who have a collaborative investment history of over eight years. Secondly, their ability to offer significant value relative to their investment size. Lastly, their unique studio model, which fosters company incubation and creation, enhancing their deal flow.
With the latest fund, Interplay’s total assets under management reach $150 million. The plan is to invest in around 20 companies, allocating $1 to $2 million per investment, reserving funds for subsequent investments. Already, 40% of the fund has been invested, including recent investments in two construction tech firms, OnSiteIQ and Roofr.
2023 presented challenges in fundraising for both companies and venture capital firms. Davis acknowledges the difficult climate, yet praises his team’s achievement against market odds, attributing it to their decade-long dedication.
In recent times, advancing to a Series A funding round has been particularly challenging. Davis agrees that market fluctuations have impacted this stage, but notes that many promising companies are raising capital at what he deems “reasonable valuations.” Despite the lure of higher valuations during the investment surge, Interplay has remained disciplined in its capital allocation, often passing on opportunities reflective of market over-enthusiasm.
Davis finds the current market appealing for investments, as company valuations have realigned with what Interplay considers reasonable. He acknowledges the potential issues for entrepreneurs in cases of overcorrection in valuation but believes that fair valuations set the stage for sustained company success.