Topic: Feature Request (?): Tagging Tool

Posted under General

Backstory
Just the other day, while I was surfin' the interwebs, I came across MyImouto, imageboard software forked from danbooru v1. After hours of bashing my head against a wall (since I don't have many skills in server management or web design), I finally got it installed on my Lubuntu 14.04. I don't have many pictures in it yet, and I can't upload quickly, because for each picture I have to open iqdb.hary.lu, upload the picture so it gets looked up on e621, open the result, copy and paste the e621 link into the source field on my server, copy and paste the tags from e621 into the tags field of my site, then upload it. There has to be a way to simplify this.

Request
I was looking through the forums the other day, and I found that some users were able to use UserScripts to modify the way the post page looked. Instead of having to click both "+Favorite" and "+1", all they had to click was "+Favorite", and it'd do both.

I was wondering if there was some kind of UserScript I could deploy that would show a button on screen that automatically copies the list of tags in the "edit" tags box to my clipboard. I would still have to do all the steps above to upload, but this would make it much faster.

Thanks!
Nikolai

Updated

I am confused after reading that backstory, and your upload method seems overcomplicated to me. You could just use Google image search and see if the pic you're wanting to upload is already here.

Updated by anonymous

treos said:
I am confused after reading that backstory, and your upload method seems overcomplicated to me. You could just use Google image search and see if the pic you're wanting to upload is already here.

If that is what they want, the site has its own image search engine: Harry.Lu

Just upload an image into there, or copy the download link & paste into the search bar, and it'll search through here. It even searches for previously deleted images.

Updated by anonymous

treos said:
I am confused after reading that backstory, and your upload method seems overcomplicated to me. You could just use Google image search and see if the pic you're wanting to upload is already here.

I wanted to essentially create an offline e621, and for all intents and purposes I have. I just wanted to be able to click a button that would copy all the tags of a post to my clipboard (or whatever the equivalent is on Ubuntu). I find it time-consuming to scroll down, click edit, select all the tags in the tags text field, copy them, then switch tabs and paste into my own tags text field.

Updated by anonymous

You might find TMSU interesting.
It would probably be simpler than the process you describe.

Personally, I do not bother copying tags; I wrote a simple tool that, given a list of files, calculates the MD5 sum of each file, searches for it on e621, and applies the tags from the resulting post via TMSU.

Anyway, supposing that what you're using is well suited for you, the actual process still seems to have problems. Wouldn't it be more appropriate to add a field into the upload form where you can paste in an e621 post uri, and have the server automatically query e621 and find the tag list?
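A rough Python sketch of that "paste a post URI" idea, using only the standard library. It assumes post links look like https://e621.net/post/show/<id> and that an `id:` search metatag behaves like the `md5:` one mentioned elsewhere in this thread; both are assumptions, and the function names are made up for illustration. The site may also expect a descriptive User-Agent header, which this sketch does not set.

```python
import json
import re
import urllib.parse
import urllib.request

API_URL = "https://e621.net/post/index.json"  # endpoint quoted in this thread


def post_id_from_url(url):
    """Return the numeric post id from a link like .../post/show/843816 (assumed URL shape)."""
    match = re.search(r"/post/show/(\d+)", url)
    if match is None:
        raise ValueError("no post id found in %r" % url)
    return int(match.group(1))


def query_url(post_id):
    """Build the search URL; the id: metatag is an assumption, not a documented fact here."""
    return API_URL + "?" + urllib.parse.urlencode({"tags": "id:%d" % post_id})


def fetch_tags(post_url):
    """Query the site and return the tag string of the first matching post, or '' if none."""
    with urllib.request.urlopen(query_url(post_id_from_url(post_url))) as response:
        posts = json.loads(response.read().decode("utf-8"))
    return posts[0]["tags"] if posts else ""
```

On the server side, the upload form handler would call something like fetch_tags(pasted_uri) and pre-fill the tags field with the result.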

Updated by anonymous

savageorange said:
You might find TMSU interesting.
It would probably be simpler than the process you describe.

Personally, I do not bother copying tags; I wrote a simple tool that, given a list of files, calculates the MD5 sum of each file, searches for it on e621, and applies the tags from the resulting post via TMSU.

Anyway, supposing that what you're using is well suited for you, the actual process still seems to have problems. Wouldn't it be more appropriate to add a field into the upload form where you can paste in an e621 post uri, and have the server automatically query e621 and find the tag list?

Reading up on that TMSU program you use, I liked the idea behind it, but I wanted to keep my pictures in an Imageboard-like environment. It just... feels right.

You said you had a simple tool that you made that compares the md5sum of any files you put through it to the ones here at e621. Is that something I can find on GitHub? I'm very interested.

Updated by anonymous

Nikolaithefur said:

You said you had a simple tool that you made that compares the md5sum of any files you put through it to the ones here at e621. Is that something I can find on GitHub? I'm very interested.

... That part is dead simple.

You can calculate the md5sum of any file using the md5sum program, which is installed by default in almost all Linux distros.

Once you have that md5sum, you just query https://e621.net/post/index.json?tags=md5:THEMD5SUM , which gives you a json file with one record per matching post.
Then you just parse the json (there are truckloads of tools to do that) and apply the tags/whatever else you want to do.

----

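Here's the code I use for that particular part: (written in Python using the requests library)

import json
import sys

import requests

def getjson(md5sum):
    # ask e621 for every post whose file has this md5sum
    text = requests.post("https://e621.net/post/index.json", {'tags': 'md5:%s' % md5sum}).text
    try:
        j = json.loads(text)
    except ValueError:
        # report the unparseable response on stderr, then re-raise
        print('returned text %r didn\'t look like JSON' % text, file=sys.stderr)
        raise
    return j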
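For completeness, the md5 step itself can also be done without leaving Python. This is a small sketch using only the standard library; the function name file_md5 is made up for illustration and is not part of the program above.

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" (end of file)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The result is the same hex string that the md5sum program prints in its first column, so it can be passed straight into a query such as getjson(file_md5("picture.png")).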

But you can do the whole task much more compactly, for example in 3 commands:

#!/bin/sh
# this is a shell script (sh/bash/zsh/etc), not Python.
# you should set it executable. Then run it, passing the path to the file you want to look up.
if [ ! -e "$1" ]; then
  # bail out if the user didn't specify what file to look at,
  # or they specified a file that doesn't exist.
  echo "Syntax: e6autotag <filename>"
  exit 1
fi
# get the md5sum, ignore the part of the output that specifies filename
M=$(md5sum "$1" | cut -d ' ' -f 1)
# talk to e621, get records in JSON format
wget -O /tmp/md5l.json "https://e621.net/post/index.json?tags=md5:$M"
# get 1st item in result array, return its 'tags' field
tags=$(jq --raw-output '.[0].tags' /tmp/md5l.json)
# print the tags so they can be read, piped, or redirected
echo "$tags"

(this would require the 'jq' tool to be installed, as well as wget, but I suspect Ubuntu has packages for both of those)

Example Result (content of 'tags' field)

2016 animated anus big_penis blue_body blush butt cum cum_in_pussy cum_inside cum_on_body cumshot digital_media_(artwork) duo erection faceless_male female first_person_view goo gooey group group_sex hair handjob happy humanoid humanoid_penis looking_at_viewer looking_pleasured loop low_res lying male male/female monster_girl multiple_female multiple_scenes nude on_back on_top open_mouth orgasm penetration penis pixel_(artwork) pixel_animation pussy reverse_cowgirl_position sb sex sibling simple_background slime smile solo_focus threesome throbbing transformation translucent twins vaginal vaginal_penetration white_background young

(for reference, that's post #843816)

It's also possible to throw that list of tags on the clipboard in the same script -- look up the man page for xclip (may need to install it first).
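If the lookup is done from Python rather than the shell, the clipboard step might look like the sketch below. It assumes xclip is installed and an X session is running; the function name and the runner parameter (there only so the call can be swapped out for testing) are made up for illustration.

```python
import subprocess


def copy_to_clipboard(text, runner=subprocess.run):
    """Pipe text into xclip so it lands on the X clipboard (the Ctrl+V one)."""
    # Without "-selection clipboard", xclip fills the primary (middle-click) selection instead.
    command = ["xclip", "-selection", "clipboard"]
    return runner(command, input=text.encode("utf-8"), check=True)
```

Calling copy_to_clipboard(tags) with the tag string from the lookup would then let you paste straight into the tags field of your own site.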

----

I bpasted the entirety of -my- e6autotag program here, but I think you'll be less interested in that, as a) it does a few other things, like ignoring not-already-known tags, handling locally defined aliases, and checking favcount; b) it has extra dependencies (requests, and tmsoup)
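To give an idea of what "ignoring not-already-known tags" and "handling locally defined aliases" could mean in practice, here is a minimal Python sketch. It is not the code from the pasted program; the function name, the known_tags set, and the aliases mapping are all hypothetical.

```python
def normalise_tags(tag_string, known_tags, aliases):
    """Apply local aliases, then keep only tags that are already known locally."""
    result = []
    for tag in tag_string.split():
        tag = aliases.get(tag, tag)  # swap in the local name if an alias is defined
        if tag in known_tags and tag not in result:  # skip unknown tags and duplicates
            result.append(tag)
    return result
```

The input is the space-separated 'tags' string from the JSON record, and the output keeps the original order.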

Updated by anonymous

Nikolaithefur said:
I wanted to essentially create an offline e621, and for all intents and purposes I have. I just wanted to be able to click a button that would copy all the tags of a post to my clipboard (or whatever the equivalent is on Ubuntu). I find it time-consuming to scroll down, click edit, select all the tags in the tags text field, copy them, then switch tabs and paste into my own tags text field.

ah, ok

Updated by anonymous

savageorange said:
... That part is dead simple.

So it works, and that's awesome! Now I can do most of my image searches locally.
However, when looking in ~/md5l.json (where I reassigned it for ease), it includes a bunch of other info that I don't need... Is there a command I can use to filter out anything not related directly to tags?

Updated by anonymous

I put the data in '/tmp/' because it's -temporary- data (the system will automatically wipe it out at reboot time, which is not true for things you put in your home dir)

If you really want to put it inside your home dir, I'd suggest putting it inside ~/.cache/ (which has similar 'nonpermanent storage' implications).

Or you could just have the script promptly remove it using rm, if you don't want to inspect it.
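If the script ever moves to Python, picking a spot under ~/.cache/ could look like this sketch. It follows the XDG convention of honouring $XDG_CACHE_HOME when it is set and falling back to ~/.cache otherwise; the function name and the environ parameter are made up for illustration.

```python
import os


def cache_path(name, environ=os.environ):
    """Return a path for `name` inside the user's cache directory."""
    # XDG convention: use $XDG_CACHE_HOME if set and non-empty, otherwise ~/.cache
    base = environ.get("XDG_CACHE_HOME") or os.path.join(os.path.expanduser("~"), ".cache")
    return os.path.join(base, name)
```

For example, cache_path("md5l.json") gives a location that is fine to delete at any time, much like /tmp/.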

The jq command I wrote above does the exact filtering you're talking about; you just have to redirect its output stream to whatever file you want.

An example of how to do that (based on a compact formulation of what I wrote in my previous post)

filename="$1"; wget -O - 'https://e621.net/post/index.json?tags=md5:'$(md5sum "$filename" | cut -d ' ' -f 1) | jq --raw-output '.[0].tags' - > /tmp/e6tags
  • This just streams the output of each program into the input of the next, and finally sends jq's output (the list of tags) to /tmp/e6tags.
  • - is a special notation used to specify 'read input from standard input [or write output to standard output], rather than a named file', where the program would otherwise be expecting a normal filename.
  • shell scripting, especially when written for compactness, is cryptic, sorry. Not too bad a tradeoff for Phenomenal Cosmic Power, though ;)
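For comparison, the filtering that the jq expression '.[0].tags' performs can be sketched in Python with the standard json module. The function name is made up, and returning None for an empty result is a choice made for this sketch (as far as I know, jq itself prints the word null in that case).

```python
import json


def first_post_tags(json_text):
    """Return the 'tags' field of the first record, or None when no post matched."""
    posts = json.loads(json_text)
    if not posts:  # an empty array means the md5 matched nothing
        return None
    return posts[0]["tags"]
```

Everything else in the record (the "bunch of other info") is simply never read, which is the same effect as the jq filter.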

Updated by anonymous

savageorange said:
I put the data in '/tmp/' because it's -temporary- data (the system will automatically wipe it out at reboot time, which is not true for things you put in your home dir)

If you really want to put it inside your home dir, I'd suggest putting it inside ~/.cache/ (which has similar 'nonpermanent storage' implications).

The jq command does the exact filtering you're talking about; you just have to redirect its output stream to whatever file you want.
(but be aware that using the same filename for input and output doesn't work exactly how you think it might. It's better to use different ones.)

An example of how to do that (based on a compact formulation of what I wrote in my previous post)

filename="$1"; wget -O - 'https://e621.net/post/index.json?tags=md5:'$(md5sum "$filename" | cut -d ' ' -f 1) | jq --raw-output '.[0].tags' - > /tmp/e6tags
  • This just streams the output of each program into the input of the next, and finally sends jq's output (the list of tags) to /tmp/e6tags.
  • - is a special notation used to specify 'read input from standard input [or write output to standard output], rather than a named file', where the program would otherwise be expecting a normal filename.
  • shell scripting, especially when written for compactness, is cryptic, sorry.

You know you've been on GitHub for too long when you start commenting something, then frantically search for the "Comment and Close" button.

That works great! It makes my whole tagging process 10 times easier. Thank you!

Updated by anonymous

Now to look for some code that would give this awesome code a GUI.

Updated by anonymous

If you just want to let the user interactively browse for a file to fetch the tags of, zenity will do it (assuming, of course, that it's installed)
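From Python, the same zenity file picker could be driven as in the sketch below. It assumes zenity is installed; the function name and the runner parameter (there only so the call can be replaced in a test) are made up. zenity prints the chosen path on standard output and exits with a non-zero status when the dialog is cancelled.

```python
import subprocess


def choose_file(runner=subprocess.run):
    """Open zenity's file picker; return the chosen path, or None if it was cancelled."""
    result = runner(["zenity", "--file-selection"], stdout=subprocess.PIPE)
    if result.returncode != 0:  # user pressed Cancel or closed the dialog
        return None
    # zenity ends its output with a newline, which is not part of the filename
    return result.stdout.decode("utf-8").rstrip("\n")
```

The returned path can then be fed to whatever computes the md5sum and queries e621.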

If you want to hook it up to the web GUI of MyImouto so the tags field is automatically filled, I can't comment much on that. But there are other people here who do know JavaScript, which would probably be the 'nicest' way of handling it.

Updated by anonymous

savageorange said:
If you just want to let the user interactively browse for a file to fetch the tags of, zenity will do it (assuming, of course, that it's installed)

Cyka blyat...

Okay, hopefully you'll be able to make sense of this, because I'm almost braindead at this point...

Here's the code I used to get the zenity dialog box open:
zenity --file-selection="$filename"; filename="$1"; echo 'root' | sudo -S wget -O - 'https://e621.net/post/index.json?tags=md5:'$(md5sum "$filename" | cut -d ' ' -f 1) | jq --raw-output '.[0].tags' - > /var/www/myimouto/public/e6tags.txt

Here's the output:

root@linux:/media/ha-gay/6F49B10D115AF445/Yiff# sh op.sh
/media/ha-gay/6F49B10D115AF445/Yiff/05394a00cab1bf8f4e008a605fc2bc1f.png
md5sum: : No such file or directory
--2016-07-18 02:29:45-- https://e621.net/post/index.json?tags=md5:
Resolving e621.net (e621.net)... 2400:cb00:2048:1::6819:7717, 2400:cb00:2048:1::6819:7617, 104.25.119.23, ...
Connecting to e621.net (e621.net)|2400:cb00:2048:1::6819:7717|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2 [application/json]
Saving to: ‘STDOUT’

I mean, you've already helped me out a lot as it is (thank you btw), but I'm not that programmer-friendly...

Updated by anonymous

  • Sure. You seem to be willing to learn, which is the most important bit about programming. I wouldn't be discussing this with you if I thought you would treat it as a total black box.
  • "$1" represents the first command-line argument passed to a script. If you didn't pass an argument to it, then "$1" will expand to nothing at all.
  • Which is what you actually did: with sh op.sh, $0 (the name of the program) will equal 'op.sh', and $1 will equal.. nothing (an empty string).
  • The message log you posted is because of this. You are setting filename to an empty value, which results in trying to calculate the md5sum of a file named ''; the rest of what happens can be reasoned out from that (md5sum fails, so the query goes out with an empty md5: value, and the 2-byte response is presumably just an empty list).
  • The --file-selection option does not accept a parameter (as specified in the man page for zenity). Trying to pass one appears to be harmless (no effect), though.
  • You probably want to store the output of zenity in a variable, using $(command substitution), e.g. filename=$(zenity --file-selection). This sets the value of filename to be whatever zenity outputs (in this case, the filename that was selected by the user).
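The "missing argument" case is the same in any language. As a small Python illustration (the function name is made up), sys.argv plays the role of $0, $1, and so on, and it is worth checking for an absent or empty first argument before using it:

```python
import sys


def filename_from_args(argv):
    """Return the first command-line argument, or None if it is missing or empty."""
    # argv[0] is the program name (like $0); argv[1] corresponds to "$1"
    if len(argv) > 1 and argv[1]:
        return argv[1]
    return None
```

A script would call filename_from_args(sys.argv) and fall back to a file picker, or print a usage message, when it gets None back.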

Other issues:

  • sudoing wget has no effect in this case. wget isn't writing a file, it's writing to stdout. It is the shell itself which is writing a file, at the end of that pipeline of commands. That's what you're saying by > /var/www/myimouto/public/e6tags.txt; the shell is what needs to be able to write to that file. (the only reason you didn't get an error about it appears to be that you were ALREADY root)

* in this case, the solution is fairly straightforward : remove the sudo stuff from op.sh, and sudo sh itself, as in sudo sh op.sh
* A more 'correct' solution would probably involve setting up the permissions so that that file -- and ONLY that file -- is writable by (your normal login user), so sudo doesn't need to be involved at all. Actually doing so is more than I'm willing to try to explain on the internet though.

  • I sincerely hope that is not your actual root password. Regardless of whether it is or isn't, storing passwords in a script is almost never a wise move.

Updated by anonymous

savageorange said:

  • You probably want to store the output of zenity in a variable, using $(command substitution), e.g. filename=$(zenity --file-selection). This sets the value of filename to be whatever zenity outputs (in this case, the filename that was selected by the user).

Setting the filename variable equal to the output of zenity --file-selection worked! Thank you!

I also removed the sudo command from the script, and now I'm just running the sh command with sudo (sudo sh oper.sh)

Updated by anonymous