Compare commits

...

8 Commits

Author SHA1 Message Date
kev e6e31fd68e add whisper-asr-webservice 2024-03-27 16:16:14 +08:00
kev a529d5e527 update piper 2024-03-26 17:19:18 +08:00
kev b5d3951330 update piper 2024-03-25 18:25:57 +08:00
kev bb09770e4d update piper 2024-03-25 17:48:08 +08:00
kev 39f1db6f3d add stitching 2024-03-13 15:07:00 +08:00
kevin 95187f7f7f add piper 2024-03-10 12:18:53 +08:00
kevin df5154a338 add tts 2024-03-09 18:00:29 +08:00
kev f69a5892cc add rembg 2024-03-08 17:26:37 +08:00
10 changed files with 177 additions and 0 deletions

@ -177,6 +177,7 @@ A collection of delicious docker recipes.
- [x] obs-web-arm :joystick:
- [x] openmeetings :camera:
- [x] paddle-ocr
- [x] piper
- [x] plex :moneybag:
- [x] red5 :+1: :camera:
- [x] red5-arm :construction: :camera:
@ -428,6 +429,7 @@ A collection of delicious docker recipes.
- [x] ohmyform
  - [x] api
  - [x] ui
- [x] onerahmet/openai-whisper-asr-webservice
- [x] osixia/openldap
- [x] openresty/openresty
- [x] opensearchproject/opensearch :bucket:
@ -455,6 +457,7 @@ A collection of delicious docker recipes.
- [x] prosody/prosody
- [x] redis/redis-stack
- [x] registry
- [x] danielgatis/rembg
- [x] datarhei/restreamer
- [x] restic/rest-server
- [x] rocker/rstudio
@ -487,6 +490,7 @@ A collection of delicious docker recipes.
- [x] teamatldocker
  - [x] confluence
  - [x] jira
- [x] openstitching/stitch
- [x] strapi/strapi
- [x] amancevice/superset
- [x] matrixdotorg/synapse
@ -498,6 +502,7 @@ A collection of delicious docker recipes.
- [x] traccar/traccar
- [x] traefik
- [x] trinodb/trino
- [x] ghcr.io/coqui-ai/tts-cpu
- [x] louislam/uptime-kuma
- [x] v2ray/official :cn:
- [x] mpromonet/v4l2rtspserver :camera:

37
piper/Dockerfile Normal file

@ -0,0 +1,37 @@
#
# Dockerfile for piper
#
FROM debian:12
LABEL maintainer="EasyPi Software Foundation"
ARG PIPER_VERSION=2023.11.14-2
ARG PIPER_OS=linux
ARG PIPER_ARCH=x86_64
ARG PIPER_FILE=piper_${PIPER_OS}_${PIPER_ARCH}.tar.gz
ARG PIPER_URL=https://github.com/rhasspy/piper/releases/download/${PIPER_VERSION}/${PIPER_FILE}
ARG MODEL_BASE_URL=https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US
ARG MODEL_VOICES=amy,arctic,danny,hfc_female,hfc_male,joe,kathleen,kristin,kusal,l2arctic,lessac,libritts,libritts_r,ljspeech,ryan
ARG MODEL_QUALITY=medium
WORKDIR /opt/piper
RUN set -xe \
    && apt-get update \
    && apt-get install -y curl \
    && curl -sSL ${PIPER_URL} | tar xz --strip 1 \
    && mkdir models \
    && cd models \
    && echo ${MODEL_VOICES} | tr ',' '\n' | while read voice; \
       do \
         model_url=${MODEL_BASE_URL}/${voice}/${MODEL_QUALITY}/en_US-${voice}-${MODEL_QUALITY}.onnx; \
         curl -sSL -O ${model_url} -O ${model_url}.json; \
       done \
    && cd .. \
    && ./piper --version \
    && apt-get remove -y curl \
    && rm -rf /var/lib/apt/lists/*
ENTRYPOINT ["/opt/piper/piper"]
CMD ["-m", "/opt/piper/models/en_US-lessac-medium.onnx", "-d", "/tmp"]
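The download loop in the Dockerfile above builds one model URL per voice from `MODEL_BASE_URL`, the voice name, and `MODEL_QUALITY`. A minimal dry-run sketch of that loop is shown here; it prints the URLs instead of fetching them. The function name `piper_model_urls` is hypothetical and not part of piper, and the defaults are copied from the `ARG` lines.

```shell
#!/bin/sh
# Dry run of the Dockerfile's download loop: print URLs instead of fetching.
# Defaults mirror the ARG values in the Dockerfile; override via environment.
MODEL_BASE_URL=${MODEL_BASE_URL:-https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US}
MODEL_QUALITY=${MODEL_QUALITY:-medium}

# piper_model_urls VOICES [QUALITY]
# Hypothetical helper: VOICES is a comma-separated list, as in MODEL_VOICES.
piper_model_urls() {
    voices=$1
    quality=${2:-$MODEL_QUALITY}
    echo "$voices" | tr ',' '\n' | while read -r voice; do
        model_url=$MODEL_BASE_URL/$voice/$quality/en_US-$voice-$quality.onnx
        echo "$model_url"        # the ONNX model
        echo "$model_url.json"   # its config file, fetched alongside it
    done
}

piper_model_urls amy,lessac
```

Because the build's `curl` calls do not use `-f`, a missing file would not stop the build. Feeding each printed line to `curl -fsIL` would be one way to check that a voice exists at the chosen quality before building.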

19
piper/README.md Normal file

@ -0,0 +1,19 @@
piper
=====
[piper][1] is a fast, local neural text-to-speech system that sounds great and is optimized for the Raspberry Pi 4.
```bash
# Create an alias
$ alias piper='docker run -i --rm -u $(id -u):$(id -g) -v $PWD:/tmp vimagick/piper -m /opt/piper/models/en_US-amy-medium.onnx'
# Do text-to-speech
$ echo 'Welcome to the world of speech synthesis!' | piper -f /tmp/welcome.wav
# Play audio
$ play welcome.wav
```
List of voices: https://rhasspy.github.io/piper-samples/
[1]: https://github.com/rhasspy/piper

32
rembg/README.md Normal file

@ -0,0 +1,32 @@
rembg
=====
[Rembg][1] is a tool for removing image backgrounds.
## Web Service
```bash
$ docker compose up -d
$ url=https://raw.githubusercontent.com/danielgatis/rembg/master/examples/girl-3.jpg
$ curl -sSL $url -o input.jpg
$ curl -s -G http://localhost:7000/api/remove -d url=$url -o output.png
$ curl -s http://localhost:7000/api/remove -F file=@input.jpg -o output.png
```
## Ad Hoc Commands
```bash
# Create an alias
$ alias rembg='docker run --rm -u $(id -u):$(id -g) -v $PWD:/rembg danielgatis/rembg:2'
# Remove the background from a local file
$ rembg i input.png output.png
# Remove the background, returning only the mask
$ rembg i -om input.png output.png
# Remove the background with alpha matting applied
$ rembg i -a input.png output.png
```
[1]: https://github.com/danielgatis/rembg

8
rembg/docker-compose.yml Normal file

@ -0,0 +1,8 @@
version: "3.8"
services:
  rembg:
    image: danielgatis/rembg:2
    command: s --host 0.0.0.0 --port 7000 --log_level info
    ports:
      - "7000:7000"
    restart: unless-stopped

12
stitching/README.md Normal file

@ -0,0 +1,12 @@
stitching
=========
[stitching][1] is a Python package for fast and robust image stitching.
```bash
$ alias stitch='docker run --rm -v $PWD:/data openstitching/stitch'
$ stitch *.jpg
```
[1]: https://github.com/OpenStitching/stitching

13
tts/README.md Normal file

@ -0,0 +1,13 @@
TTS
===
[TTS][1] is a deep learning toolkit for text-to-speech, battle-tested in research and production.
```bash
$ docker compose up -d
$ docker compose exec tts bash
>>> python3 TTS/server/server.py --list_models
>>> exit
```
[1]: https://github.com/coqui-ai/TTS

12
tts/docker-compose.yml Normal file

@ -0,0 +1,12 @@
version: "3.8"
services:
  tts:
    image: ghcr.io/coqui-ai/tts-cpu:v0.22.0
    entrypoint: ["python3"]
    command: |
      TTS/server/server.py
      --model_name tts_models/en/vctk/vits
      --extra_model_name tts_models/en/ljspeech/tacotron2-DDC_ph
    ports:
      - "5002:5002"
    restart: unless-stopped

@ -0,0 +1,26 @@
whisper-asr-webservice
======================
[Whisper ASR Webservice][1] is a free transcription service powered by Whisper AI.
It supports the following Whisper engines:
- [openai/whisper](https://github.com/openai/whisper)
- [SYSTRAN/faster-whisper](https://github.com/SYSTRAN/faster-whisper)
## Server
```bash
$ docker compose up -d
$ curl http://127.0.0.1:9000/docs
```
## Client
```bash
$ wget -O audio.wav https://github.com/rhasspy/piper/raw/master/notebooks/wav/en/success.wav
$ curl -F audio_file=@audio.wav "http://127.0.0.1:9000/asr?task=transcribe&output=srt"
$ curl -F audio_file=@audio.wav "http://127.0.0.1:9000/detect-language"
```
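The client calls above differ only in endpoint and query string, so a small helper can keep them consistent in scripts. This is a minimal sketch: `asr_url` and `ASR_BASE_URL` are hypothetical names, and only the `task` and `output` parameters shown in the commands above are used.

```shell
#!/bin/sh
# Base URL of the service started by `docker compose up -d` (port 9000 above).
ASR_BASE_URL=${ASR_BASE_URL:-http://127.0.0.1:9000}

# asr_url [TASK] [OUTPUT]
# Hypothetical helper: prints the /asr URL for the given task and output format.
asr_url() {
    task=${1:-transcribe}
    output=${2:-srt}
    echo "$ASR_BASE_URL/asr?task=$task&output=$output"
}

asr_url transcribe srt
```

It can then be dropped into the upload command, for example `curl -F audio_file=@audio.wav "$(asr_url transcribe srt)"`.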
[1]: https://github.com/ahmetoner/whisper-asr-webservice

@ -0,0 +1,13 @@
version: "3.8"
services:
  asr:
    image: onerahmet/openai-whisper-asr-webservice
    ports:
      - "9000:9000"
    volumes:
      - ./data:/data
    environment:
      - ASR_MODEL=medium
      - ASR_ENGINE=faster_whisper
      - ASR_MODEL_PATH=/data
    restart: unless-stopped