add whisper-asr-webservice

update piper
2024-03-27 16:16:14 +08:00 · 2024-03-26 17:19:18 +08:00 · 2024-03-25 18:25:57 +08:00 · 2024-03-25 17:48:08 +08:00 · 2024-03-13 15:07:00 +08:00 · 2024-03-10 12:18:53 +08:00
10 changed files with 177 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -177,6 +177,7 @@ A collection of delicious docker recipes.
 - [x] obs-web-arm :joystick:
 - [x] openmeetings :camera:
 - [x] paddle-ocr
+- [x] piper
 - [x] plex :moneybag:
 - [x] red5 :+1: :camera:
 - [x] red5-arm :construction: :camera:
@@ -428,6 +429,7 @@ A collection of delicious docker recipes.
 - [x] ohmyform
  - [x] api
  - [x] ui
+- [x] onerahmet/openai-whisper-asr-webservice
 - [x] osixia/openldap
 - [x] openresty/openresty
 - [x] opensearchproject/opensearch :bucket:
@@ -455,6 +457,7 @@ A collection of delicious docker recipes.
 - [x] prosody/prosody
 - [x] redis/redis-stack
 - [x] registry
+- [x] danielgatis/rembg
 - [x] datarhei/restreamer
 - [x] restic/rest-server
 - [x] rocker/rstudio
@@ -487,6 +490,7 @@ A collection of delicious docker recipes.
 - [x] teamatldocker
    - [x] confluence
    - [x] jira
+- [x] openstitching/stitch
 - [x] strapi/strapi
 - [x] amancevice/superset
 - [x] matrixdotorg/synapse
@@ -498,6 +502,7 @@ A collection of delicious docker recipes.
 - [x] traccar/traccar
 - [x] traefik
 - [x] trinodb/trino
+- [x] ghcr.io/coqui-ai/tts-cpu
 - [x] louislam/uptime-kuma
 - [x] v2ray/official :cn:
 - [x] mpromonet/v4l2rtspserver :camera:
--- a/piper/Dockerfile
+++ b/piper/Dockerfile
@@ -0,0 +1,37 @@
+#
+# Dockerfile for piper
+#
+
+FROM debian:12
+MAINTAINER EasyPi Software Foundation
+
+ARG PIPER_VERSION=2023.11.14-2
+ARG PIPER_OS=linux
+ARG PIPER_ARCH=x86_64
+ARG PIPER_FILE=piper_${PIPER_OS}_${PIPER_ARCH}.tar.gz
+ARG PIPER_URL=https://github.com/rhasspy/piper/releases/download/${PIPER_VERSION}/${PIPER_FILE}
+
+ARG MODEL_BASE_URL=https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US
+ARG MODEL_VOICES=amy,arctic,danny,hfc_female,hfc_male,joe,kathleen,kristin,kusal,l2arctic,lessac,libritts,libritts_r,ljspeech,ryan
+ARG MODEL_QUALITY=medium
+
+WORKDIR /opt/piper
+
+RUN set -xe \
+ && apt update -y \
+ && apt install -y curl \
+ && curl -sSL ${PIPER_URL} | tar xz --strip 1 \
+ && mkdir models \
+ && cd models \
+ && echo ${MODEL_VOICES} | tr ',' '\n' | while read voice; \
+    do \
+      model_url=${MODEL_BASE_URL}/${voice}/${MODEL_QUALITY}/en_US-${voice}-${MODEL_QUALITY}.onnx; \
+      curl -sSL -O ${model_url} -O ${model_url}.json; \
+    done \
+ && cd .. \
+ && ./piper --version \
+ && apt remove -y curl \
+ && rm -rf /var/lib/apt/lists/*
+
+ENTRYPOINT ["/opt/piper/piper"]
+CMD ["-m", "/opt/piper/models/en_US-lessac-medium.onnx", "-d", "/tmp"]
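Review note: the `RUN` step above expands `MODEL_VOICES` into two downloads per voice, the `.onnx` model and its `.onnx.json` config. A minimal sketch of that expansion, runnable outside the image build, is given here. The function name `piper_model_urls` is hypothetical and not part of the commit, and nothing is downloaded.

```bash
# Sketch only: mirrors the URL construction in piper/Dockerfile.
# piper_model_urls is a hypothetical helper name, not part of the commit.
MODEL_BASE_URL=https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US

piper_model_urls() {
  local voices=$1 quality=$2 voice model_url
  # Split the comma-separated voice list into one voice per line.
  echo "$voices" | tr ',' '\n' | while read -r voice; do
    model_url=${MODEL_BASE_URL}/${voice}/${quality}/en_US-${voice}-${quality}.onnx
    echo "$model_url"       # the model itself
    echo "$model_url.json"  # the matching config file
  done
}

# Print the URLs the build would fetch for two of the listed voices.
piper_model_urls amy,lessac medium
```

Printing the list before a build is one way to check that every voice in `MODEL_VOICES` exists at the chosen `MODEL_QUALITY`.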
--- a/piper/README.md
+++ b/piper/README.md
@@ -0,0 +1,19 @@
+piper
+=====
+
+[piper][1] is a fast, local neural text-to-speech system that sounds great and is optimized for the Raspberry Pi 4.
+
+```bash
+# Create an alias
+$ alias piper='docker run -i --rm -u $(id -u):$(id -g) -v $PWD:/tmp vimagick/piper -m /opt/piper/models/en_US-amy-medium.onnx'
+
+# Do text-to-speech
+$ echo 'Welcome to the world of speech synthesis!' | piper -f /tmp/welcome.wav
+
+# Play audio
+$ play welcome.wav
+```
+
+List of voices: https://rhasspy.github.io/piper-samples/
+
+[1]: https://github.com/rhasspy/piper
--- a/rembg/README.md
+++ b/rembg/README.md
@@ -0,0 +1,32 @@
+rembg
+=====
+
+[Rembg][1] is a tool for removing image backgrounds.
+
+## Web Service
+
+```bash
+$ docker compose up -d
+$ url=https://raw.githubusercontent.com/danielgatis/rembg/master/examples/girl-3.jpg
+$ curl -sSL $url -o input.jpg
+$ curl -s -G http://localhost:7000/api/remove -d url=$url -o output.png
+$ curl -s http://localhost:7000/api/remove -F file=@input.jpg -o output.png
+```
+
+## Ad Hoc Commands
+
+```bash
+# Create an alias
+$ alias rembg='docker run --rm -u $(id -u):$(id -g) -v $PWD:/rembg danielgatis/rembg:2'
+
+# Remove the background from a local file
+$ rembg i input.png output.png
+
+# Remove the background and output only the mask
+$ rembg i -om input.png output.png
+
+# Remove the background with alpha matting applied
+$ rembg i -a input.png output.png
+```
+
+[1]: https://github.com/danielgatis/rembg
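Review note: the ad hoc commands above handle one file at a time. A minimal sketch of a batch dry run follows, assuming each output should be a PNG named after its source file. The helper `rembg_batch_plan` is hypothetical and not part of the commit. It only prints the commands and does not invoke Docker.

```bash
# Sketch only: prints one "rembg i" command per input instead of running it.
# rembg_batch_plan is a hypothetical helper, not part of the commit.
rembg_batch_plan() {
  for src in "$@"; do
    # Strip the extension and target a PNG next to the source file.
    echo "rembg i $src ${src%.*}.png"
  done
}

rembg_batch_plan girl-3.jpg car.jpg
```

Reviewing the printed plan first avoids overwriting a file when a source is already a `.png`.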
--- a/rembg/docker-compose.yml
+++ b/rembg/docker-compose.yml
@@ -0,0 +1,8 @@
+version: "3.8"
+services:
+  rembg:
+    image: danielgatis/rembg:2
+    command: s --host 0.0.0.0 --port 7000 --log_level info
+    ports:
+      - "7000:7000"
+    restart: unless-stopped
--- a/stitching/README.md
+++ b/stitching/README.md
@@ -0,0 +1,12 @@
+stitching
+=========
+
+[stitching][1] is a Python package for fast and robust Image Stitching.
+
+```bash
+$ alias stitch='docker run --rm -v $PWD:/data openstitching/stitch'
+
+$ stitch *.jpg
+```
+
+[1]: https://github.com/OpenStitching/stitching
--- a/tts/README.md
+++ b/tts/README.md
@@ -0,0 +1,13 @@
+TTS
+===
+
+[TTS][1] is a deep learning toolkit for Text-to-Speech, battle-tested in research and production.
+
+```bash
+$ docker compose up -d
+$ docker compose exec tts bash
+>>> python3 TTS/server/server.py --list_models
+>>> exit
+```
+
+[1]: https://github.com/coqui-ai/TTS
--- a/tts/docker-compose.yml
+++ b/tts/docker-compose.yml
@@ -0,0 +1,12 @@
+version: "3.8"
+services:
+  tts:
+    image: ghcr.io/coqui-ai/tts-cpu:v0.22.0
+    entrypoint: ["python3"]
+    command: |
+      TTS/server/server.py
+      --model_name tts_models/en/vctk/vits
+      --extra_model_name tts_models/en/ljspeech/tacotron2-DDC_ph
+    ports:
+      - "5002:5002"
+    restart: unless-stopped
--- a/whisper-asr-webservice/README.md
+++ b/whisper-asr-webservice/README.md
@@ -0,0 +1,26 @@
+whisper-asr-webservice
+======================
+
+[Whisper ASR Webservice][1] is a free transcription service powered by Whisper AI.
+
+It supports the following Whisper engines:
+
+- [openai/whisper](https://github.com/openai/whisper)
+- [SYSTRAN/faster-whisper](https://github.com/SYSTRAN/faster-whisper)
+
+## Server
+
+```bash
+$ docker compose up -d
+$ curl http://127.0.0.1:9000/docs
+```
+
+## Client
+
+```bash
+$ wget -O audio.wav https://github.com/rhasspy/piper/raw/master/notebooks/wav/en/success.wav
+$ curl -F audio_file=@audio.wav "http://127.0.0.1:9000/asr?task=transcribe&output=srt"
+$ curl -F audio_file=@audio.wav "http://127.0.0.1:9000/detect-language"
+```
+
+[1]: https://github.com/ahmetoner/whisper-asr-webservice
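Review note: the client example above passes `task` and `output` as query parameters to `/asr`. A minimal sketch of a helper that builds that URL follows. The names `asr_url` and `ASR_BASE_URL` are hypothetical and not part of the commit, and only the parameter values shown in the README are used.

```bash
# Sketch only: builds the /asr URL used in the README's client example.
# asr_url and ASR_BASE_URL are hypothetical names, not part of the commit.
ASR_BASE_URL=http://127.0.0.1:9000

asr_url() {
  # $1 = task, $2 = output format, as in the README's query string
  printf '%s/asr?task=%s&output=%s\n' "$ASR_BASE_URL" "$1" "$2"
}

asr_url transcribe srt
# With the service from docker-compose.yml running, a file could be sent with:
#   curl -F audio_file=@audio.wav "$(asr_url transcribe srt)"
```

Quoting the URL matters because the `&` in the query string would otherwise be interpreted by the shell.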
--- a/whisper-asr-webservice/docker-compose.yml
+++ b/whisper-asr-webservice/docker-compose.yml
@@ -0,0 +1,13 @@
+version: "3.8"
+services:
+  asr:
+    image: onerahmet/openai-whisper-asr-webservice
+    ports:
+      - "9000:9000"
+    volumes:
+      - ./data:/data
+    environment:
+      - ASR_MODEL=medium
+      - ASR_ENGINE=faster_whisper
+      - ASR_MODEL_PATH=/data
+    restart: unless-stopped
Author	SHA1	Message	Date
kev	e6e31fd68e	add whisper-asr-webservice	2024-03-27 16:16:14 +08:00
kev	a529d5e527	update piper	2024-03-26 17:19:18 +08:00
kev	b5d3951330	update piper	2024-03-25 18:25:57 +08:00
kev	bb09770e4d	update piper	2024-03-25 17:48:08 +08:00
kev	39f1db6f3d	add stitching	2024-03-13 15:07:00 +08:00
kevin	95187f7f7f	add piper	2024-03-10 12:18:53 +08:00
kevin	df5154a338	add tts	2024-03-09 18:00:29 +08:00
kev	f69a5892cc	add rembg	2024-03-08 17:26:37 +08:00