OwOCR Setup Guide

A simple guide on how to quickly set up and use OwOCR, an optical character recognition program that can make reading Japanese text from visual novels much easier.

Note: OwOCR is available as a standalone GUI application and, on all platforms, as a terminal tool installed via pip. Alternatively, you can use GameSentenceMiner (GSM), which includes a fork of OwOCR as part of its all-in-one mining setup. Because the GSM fork is based on an older version of OwOCR, its results can differ, and newer features such as text reordering and furigana filtering may be missing.

OwOCR in action

Installation

  • Download the binary from the releases page
  • On Windows, extract the zip and double-click on owocr.exe
  • On macOS, open the dmg, drag the app to Applications, then grant permissions on first launch
  • Once you open the application, a log viewer and a tray application will appear, allowing you to easily change the configuration and monitor the application
OwOCR Log Viewer

Terminal / Python (All Platforms)

  1. Install Python (3.11, 3.12, or 3.13)

    Warning: During installation, make sure to select "Add to PATH". If you skip this step, your terminal may not find the pip and owocr commands, and OwOCR will not work.

  2. Open your command prompt/terminal and run:
    pip install owocr
    
  3. Install Google Lens support for the best OCR accuracy:
    pip install "owocr[lens]"
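
If the pip or owocr command is not found after these steps, a small check like the one below can help narrow down the cause. It is only a sketch: the helper names are made up for this guide, and it uses nothing beyond the Python standard library to report whether the running Python version is one of those listed above and whether the relevant commands are on PATH.

```python
import shutil
import sys

# Versions this guide lists as supported (3.11, 3.12, 3.13).
SUPPORTED_VERSIONS = {(3, 11), (3, 12), (3, 13)}


def python_version_supported(version=None):
    """Return True if the (major, minor) pair is one the guide lists."""
    major, minor = (version or sys.version_info)[:2]
    return (major, minor) in SUPPORTED_VERSIONS


def command_on_path(name):
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


def preflight_report():
    """Build a short, human-readable status report (hypothetical helper)."""
    current = f"{sys.version_info.major}.{sys.version_info.minor}"
    lines = [f"Python {current} supported: {python_version_supported()}"]
    for command in ("python", "pip", "owocr"):
        lines.append(f"{command} on PATH: {command_on_path(command)}")
    return "\n".join(lines)
```

Calling print(preflight_report()) prints one line per check. A False next to owocr after a successful pip install usually points to a PATH problem rather than a broken install.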
    

Other providers:

  • Bing: pre-installed → key: b (close second best, recommended)
  • Apple Live Text: pre-installed → key: d (macOS only, best local engine on Mac, vertical text support on Sonoma+)
  • Apple Vision: pre-installed → key: a (macOS only, older version of Live Text)
  • OneOCR: pip install "owocr[oneocr]" → key: z (Windows 10/11, fast and accurate)
  • Manga OCR: pip install "owocr[mangaocr]" → key: m (ideal for small text areas)
  • EasyOCR: pip install "owocr[easyocr]" → key: e
  • RapidOCR: pip install "owocr[rapidocr]" → key: r
  • WinRT OCR: pip install "owocr[winocr]" → key: w (Windows 10/11)
  • meikiocr: pip install "owocr[meikiocr]" → key: k (fast local option, best for Linux, supports GPU acceleration with onnxruntime-gpu. Cannot process vertical text)

From the GitHub page: if you use an online engine such as Lens, the developer recommends also setting a secondary local engine with -es: -es=oneocr on Windows, -es=alivetext on macOS, -es=meikiocr on Linux.
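
The per-platform recommendation above can be expressed as a small lookup. The following is a minimal sketch, not part of OwOCR: the function names are invented for illustration, and the only facts it encodes are the three -es values listed above plus the strings that Python's platform.system() returns.

```python
import platform

# Secondary engines recommended in this guide, keyed by platform.system().
SECONDARY_ENGINE_BY_OS = {
    "Windows": "oneocr",
    "Darwin": "alivetext",  # macOS reports itself as "Darwin"
    "Linux": "meikiocr",
}


def recommended_secondary_engine(system=None):
    """Return the recommended -es value, or None for an unlisted OS."""
    return SECONDARY_ENGINE_BY_OS.get(system or platform.system())


def secondary_engine_args(system=None):
    """Return the extra command-line arguments, e.g. ['-es=oneocr']."""
    engine = recommended_secondary_engine(system)
    return [f"-es={engine}"] if engine else []
```

On an unlisted operating system the helper returns an empty list, so the command simply runs without a secondary engine.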

Getting Started

First, open OwOCR; a log viewer window and a tray icon will appear. For basic configuration, right-click the tray icon to open its menu, where you can change settings such as the OCR engine.

OwOCR tray menu

Click Configure from the tray menu to open the settings. Here are some suggested settings for use with visual novels:

Under the General tab:

  • read_from: screencapture
  • write_to: websocket (see the Textractor Guide if you need help setting up a texthooker page for WebSocket output)

Under the Engines tab:

  • engine: Google Lens for the best accuracy, or oneocr on Windows for faster local processing
  • engine_secondary: oneocr on Windows, alivetext on macOS, meikiocr on Linux

The local engines listed above are also perfectly valid as your primary engine if you prefer faster processing and privacy. Only enable the engine(s) you actually use, as extra engines significantly increase startup time.

Under the Screen capture tab:

  • screen_capture_area controls what OwOCR captures. By default it's set to automatic selection, which prompts you to select a capture area on startup. You can also set it to capture the entire screen, or set it to window, which will let you enter a window name to lock onto.
OwOCR Configuration Editor

You can also quickly switch between engines at any time from the tray menu under Change engine.

To start and stop OCR, left click on the tray icon (Windows) or select Pause/Unpause from the tray menu.

Terminal

Tip: You can run owocr_config from the terminal to open the same configuration editor shown above.

Screen Capture Mode

This is the easiest way to get started. OwOCR captures directly from your screen with a hotkey.

owocr -r=screencapture --engine=glens

This starts OwOCR with Google Lens and opens an area selector when triggered. The recognized text is copied to your clipboard for dictionary lookup.

Capture a specific window:

owocr -r=screencapture -sa=YourWindowTitle --engine=glens

Replace YourWindowTitle with part of your game's window title. OwOCR will capture the first window that matches.

Capture just part of a window:

owocr -r=screencapture -sa=YourWindowTitle -swa="" --engine=glens

Adding -swa="" opens a selector to choose a specific area within that window. Useful when you only want to OCR the text box.
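
If you launch OwOCR from a script rather than typing the command, the three variants above can be assembled programmatically. The sketch below is illustrative only: build_screencapture_command is a hypothetical helper, and it uses just the flags shown in this section. Passing the result to subprocess.run as a list avoids shell quoting problems with window titles that contain spaces.

```python
def build_screencapture_command(window_title=None, pick_region=False, engine="glens"):
    """Build the owocr argument list for the screen capture examples above."""
    args = ["owocr", "-r=screencapture"]
    if window_title:
        # Part of the window title is enough; the first matching window is used.
        args.append(f"-sa={window_title}")
        if pick_region:
            # Same as -swa="" in a shell: an empty value opens the area selector.
            args.append("-swa=")
    args.append(f"--engine={engine}")
    return args
```

For example, subprocess.run(build_screencapture_command("YourWindowTitle", pick_region=True)) would start the same session as the last command above, assuming owocr is on PATH.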

Create a batch file (e.g., run_owocr.bat) for easy one-click launching. This script lets you choose between screen capture and clipboard mode, pick your output destination, and optionally target a specific window from a list of open windows.

Warning: Don't name the file owocr.bat or it will call itself instead of the actual program.

@echo off
setlocal enabledelayedexpansion
set "OWOCR_CMD=owocr"
set "WEBSOCKET_PORT=6677"

echo === OwOCR Launcher ===
echo.
echo Choose input mode:
echo   1: Screen capture (auto-capture window/region)
echo   2: Clipboard (manual screenshots)
echo.
set /p "input_mode=Enter 1 or 2: "

echo.
echo Choose output mode:
echo   1: Clipboard (default)
echo   2: Websocket (for Agent/other tools)
echo.
set /p "output_mode=Enter 1 or 2: "

if "%output_mode%"=="2" (
    set "OUTPUT_ARGS=-w websocket -wp %WEBSOCKET_PORT%"
) else (
    set "OUTPUT_ARGS=-w clipboard"
)

if "%input_mode%"=="2" (
    echo.
    echo Launching in clipboard mode...
    %OWOCR_CMD% -r clipboard --engine glens -es oneocr %OUTPUT_ARGS%
    goto end
)

:: Screen capture mode
echo.
choice /C YN /M "Target a specific window"
if errorlevel 2 goto fullscreen

echo.
echo Available windows:
echo ---------------------------------
set i=0
for /f "delims=" %%A in ('powershell -NoProfile -Command ^
    "Get-Process | Where-Object { $_.MainWindowTitle -and $_.MainWindowTitle.Trim() } | ForEach-Object { $_.MainWindowTitle } | Sort-Object -Unique"') do (
    set /a i+=1
    set "win[!i!]=%%A"
    echo !i!: %%A
)
echo ---------------------------------

if %i%==0 (
    echo No windows found.
    goto fullscreen
)

set /p "choice=Enter number: "
if not defined win[%choice%] (
    echo Invalid selection.
    goto fullscreen
)

set "WINDOW=!win[%choice%]!"
echo.
echo Full title: !WINDOW!

:: Extract safe search term. The title is read from the WINDOW environment
:: variable so that quotes in a window title cannot break the PowerShell command.
for /f "delims=" %%S in ('powershell -NoProfile -Command ^
    "$t = $env:WINDOW; $m = [regex]::Match($t, '^[^/:()]+'); if($m.Success -and $m.Value.Trim().Length -gt 3){$m.Value.Trim()}else{$t.Substring(0,[Math]::Min(20,$t.Length))}"') do set "SAFE_WINDOW=%%S"

echo Using search term: !SAFE_WINDOW!
echo.

echo Choose OCR mode:
echo   1: Pick region within window (recommended)
echo   2: Capture whole window
echo.
set /p "mode=Enter 1 or 2: "

if "%mode%"=="1" (
    echo.
    echo Launching with region picker...
    %OWOCR_CMD% -r screencapture --engine glens -es oneocr %OUTPUT_ARGS% -sa "!SAFE_WINDOW!" -swa ""
) else (
    echo.
    echo Launching for whole window capture...
    %OWOCR_CMD% -r screencapture --engine glens -es oneocr %OUTPUT_ARGS% -sa "!SAFE_WINDOW!" -swa "window"
)
goto end

:fullscreen
echo.
echo Starting full-screen mode...
%OWOCR_CMD% -r screencapture --engine glens -es oneocr %OUTPUT_ARGS%

:end
echo.
echo OwOCR exited with code: %ERRORLEVEL%
pause

Clipboard Mode

If you prefer using a screenshot tool like ShareX, OwOCR can monitor your clipboard instead:

owocr --engine=glens

Take a screenshot that copies to clipboard, and OwOCR will automatically process it and replace the clipboard contents with the recognized text.

ShareX Configuration (Optional)

If you prefer clipboard mode with ShareX instead of OwOCR's built-in screen capture:

  1. Right-click the ShareX system tray icon → "Hotkey settings"
  2. Click "Add" to create a new hotkey
  3. Select Capture Region or Capture Region (Light)
  4. Set your preferred hotkey (avoid keys games use, like Shift)
ShareX config
  5. Click the gear icon next to your hotkey
  6. Under "Task" settings, enable "Override after capture tasks"
  7. Select only "Copy image to clipboard" and disable everything else
ShareX Hotkey Settings

Dictionary Integration

JL

JL is the recommended dictionary for OwOCR. It works with both clipboard and WebSocket input.

For the ShareX workflow (OCRing specific words for instant lookup), enable these settings:

  • Preferences → Popup → Auto lookup the first term when it's copied from the clipboard
  • Preferences → Popup → Auto lookup the first term when it's copied from a WebSocket
  • Preferences → Popup → Don't auto look up the first term on text change if Main Window is not minimized

These settings let you OCR text and immediately see the definition without any extra clicks.

For a more seamless experience, consider using Tsukikage, which sends text to JL only when you hover over the corresponding region on your screen.

Yomitan

Enable clipboard monitor in Yomitan, then click the magnifying glass to access the search page.

Search page

Browser Extensions

For texthooker page integration:

WebSocket Support

OwOCR can send text via WebSocket instead of clipboard:

  • Use -w=websocket to write text to WebSocket
  • Default port is 7331 (configurable in config file)
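
To check that text is actually arriving over WebSocket, a small listener can stand in for a texthooker page. This is a rough sketch under two assumptions that the guide does not state: that OwOCR hosts the WebSocket server on the local machine and clients connect to it, and that each message is plain recognized text. It also relies on the third-party websockets package (pip install websockets), which is not part of OwOCR; the function names are hypothetical.

```python
import asyncio

DEFAULT_PORT = 7331  # default port stated in this guide


def websocket_uri(port=DEFAULT_PORT, host="127.0.0.1"):
    """Build the address a local client would connect to."""
    return f"ws://{host}:{port}"


async def print_recognized_text(port=DEFAULT_PORT):
    """Connect and print every message until the connection closes."""
    # Imported here so the helper above works without the package installed.
    import websockets  # third-party: pip install websockets

    # Assumption: OwOCR is the server and sends one text message per result.
    async with websockets.connect(websocket_uri(port)) as connection:
        async for message in connection:
            print(message)


def main():
    """Run the listener; start OwOCR with -w=websocket first."""
    asyncio.run(print_recognized_text())
```

Call main() while OwOCR is running with -w=websocket. If you changed the port with -wp, pass the same value to print_recognized_text.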

Runtime Controls

While OwOCR is running, you can use these keys:

Key     Action
l       Switch to Google Lens
b       Switch to Bing
z       Switch to OneOCR
m       Switch to Manga OCR
n       Switch to Manga OCR (segmented)
k       Switch to meikiocr
w       Switch to WinRT OCR
s       Cycle through available engines
p       Pause/resume OCR
t or q  Quit the program

Command Reference

Input/Output Options

Option                      Description
-r, --read_from             Where to read images from: clipboard, websocket, unixsocket, screencapture, or a directory path
-rs, --read_from_secondary  Optional secondary input source
-w, --write_to              Where to save text: clipboard, websocket, or a file path
-e, --engine                OCR engine: glens, bing, mangaocr, mangaocrs, gvision, avision, alivetext, azure, winrtocr, oneocr, easyocr, rapidocr, ocrspace, meikiocr
-es, --engine_secondary     Secondary local engine for two-pass processing

Screen Capture Options

Option                                     Description
-sa, --screen_capture_area                 Target area: empty (selector), coordinates (x1,y1,x2,y2), screen_N, or window name
-swa, --screen_capture_window_area         Subsection of window: empty (selector), coordinates, or window for whole window
-sd, --screen_capture_delay_secs           Delay between screenshots. -1 to disable periodic capture
-sw, --screen_capture_only_active_windows  Only capture when target window is active
-sf, --screen_capture_frame_stabilization  Wait for stable text before processing. -1 waits for matching results, 0 disables
-sl, --screen_capture_line_recovery        Recover missed lines from unstable frames
-sr, --screen_capture_regex_filter         Regex to filter unwanted text (e.g., ▶|♥|・)
-sc, --screen_capture_combo                Hotkey combo for taking screenshots (e.g., <ctrl>+<shift>+s)
-scc, --coordinate_selector_combo          Hotkey combo for changing capture area
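
Before passing a pattern to -sr, it can be worth trying it out on sample text. The sketch below uses Python's re module and assumes the filter works by deleting every match from the recognized text; how OwOCR applies the pattern internally is not described in this guide, so treat the helper as an approximation for testing patterns only.

```python
import re


def is_valid_pattern(pattern):
    """Return True if `pattern` compiles as a Python regular expression."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


def apply_regex_filter(text, pattern):
    """Remove every match of `pattern` from `text` (assumed filter behaviour)."""
    return re.sub(pattern, "", text)
```

With the example pattern from the table, apply_regex_filter("▶こんにちは♥", "▶|♥|・") leaves only こんにちは.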

Text Processing Options

Option                      Description
-l, --language              Language code: ja (Japanese), zh (Chinese), ko (Korean), etc.
-j, --join_lines            Remove spaces/separators between lines
-jp, --join_paragraphs      Remove spaces/separators between paragraphs
-ls, --line_separator       Custom line separator (supports \n)
-ps, --paragraph_separator  Custom paragraph separator
-rt, --reorder_text         Regroup and reorder text instead of using OCR engine order
-f, --furigana_filter       Filter furigana from Japanese text (enabled by default for Japanese)
-of, --output_format        Output format: text (default) or json (includes coordinates)

Other Options

Option                      Description
-p, --pause_at_startup      Start paused
-d, --delete_images         Delete images after processing (when reading from directory)
-n, --notifications         Show OS notification with detected text
-a, --auto_pause            Auto-pause after N seconds of no recognition. 0 to disable
-cp, --combo_pause          Hotkey combo for pause (e.g., <ctrl>+<shift>+p)
-cs, --combo_engine_switch  Hotkey combo for switching engines
-wp, --websocket_port       WebSocket port (default: 7331)
-ds, --delay_seconds        Delay between clipboard/directory checks
-v, --verbosity             Output verbosity: -2 (full text), -1 (timestamps only), 0 (errors only), or character limit

Tips

  • Google Lens offers the best accuracy in most cases. Local engines like OneOCR (Windows), Apple Live Text (macOS), and meikiocr (Linux) often work faster and respect your privacy
  • Secondary engine: You can set a secondary local engine to enable two-pass processing. Only the changed areas are then sent to the online engine, which improves speed and accuracy
  • Furigana is automatically filtered for Japanese text
  • Avoid hotkey conflicts with game controls
  • Mouse buttons work great as hotkeys if your mouse has extra buttons
  • Test your setup on some Japanese text before starting a VN
  • Config: Navigate to tray icon → Configure, or run owocr_config. You can also edit it directly at C:\Users\YourUsername\.config\owocr_config.ini (Linux/macOS: ~/.config/owocr_config.ini)
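
To see what is currently stored in the config file without opening the editor, it can be read with Python's standard configparser module. This is a minimal sketch: the two helper names are made up for illustration, the path is the one given in the tip above, and the code makes no assumption about which sections or keys the file contains.

```python
import configparser
from pathlib import Path


def default_config_path(home=None):
    """Return <home>/.config/owocr_config.ini, the location given in this guide."""
    return Path(home or Path.home()) / ".config" / "owocr_config.ini"


def read_config(path):
    """Return {section: {key: value}} for an INI file, or {} if it is missing."""
    # interpolation=None keeps values containing '%' exactly as written.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path, encoding="utf-8")  # a missing file is silently skipped
    return {section: dict(parser[section]) for section in parser.sections()}
```

Calling read_config(default_config_path()) returns a nested dictionary that can be printed or compared against a backup before you edit the file by hand.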

For more details, see the OwOCR GitHub page.
