[Samba] truecrypt on synology as subfolder

Discussion:

Xen

2016-04-29 22:26:35 UTC

Just a small question.

==============
I was really just meaning to ask whether there are issues with running
truecrypt inside a samba share.

My system had never had any issues, but now I was writing to a folder
that is a truecrypt mount inside a samba share, and the smbd process
writing into it blocked. It is still in state "D" (waiting for IO) and I
cannot kill it.

So the question was whether there are issues with running truecrypt
inside a tree being served by smbd, but I now know that for a fact there
are :p.

Still, if you have anything to say on it, please do so.

Copying a file to my mounted samba share did not work, the files are
created but they remain at 0 size. Then apparently smbd blocks in IO and
never continues, never recovers. I cannot kill the process, I will have
to reboot the NAS but it won't unmount filesystems so I need a hard
reset.

Regards

==============

On Synology DiskStation NASses there is typically only file-level
encryption available via eCryptFS.

I had once seen the note that NFS onto an ecryptfs share wouldn't work.

Now SAMBA onto ecryptfs does work.

I just managed to get TrueCrypt compiled and reasonably working.

I mounted a container in a subdirectory of my home directory that I have
mounted from a linux box.

But writes to this subdirectory fail over the samba mount.

The files are created but retain a size of zero.

All of a sudden a lot of stuff hangs on this computer. I cannot do ls /,
or df, and I think I am going to try to dismount that share now.

Well this is fun, never had anything like this before:

malloc: unknown:0: assertion botched
free: called with unallocated block argument
last command: ls
Aborting...

Warning: Program '/bin/bash' crashed.

This was trying to tab-complete a directory in /

This is in my syslog. It was an hour and a half ago and I assume it was
caused by, or related to, writing to that truecrypt mount:

Apr 29 18:58:43 kubuntu kernel: [280345.596101] CIFS VFS: sends on sock
ffff88012a605400 stuck for 15 seconds
Apr 29 18:58:43 kubuntu kernel: [280345.596124] CIFS VFS: Error -11
sending data on socket to server
Apr 29 18:58:58 kubuntu kernel: [280360.612052] CIFS VFS: sends on sock
ffff88012a605400 stuck for 15 seconds
Apr 29 18:58:58 kubuntu kernel: [280360.612067] CIFS VFS: Error -11
sending data on socket to server
Apr 29 19:13:03 kubuntu kernel: [281205.857294] CIFS VFS: No writable
handles for inode

After that:

Apr 29 19:27:03 kubuntu kernel: [282046.009506] CIFS VFS: No writable
handles for inode
Apr 29 19:29:03 kubuntu kernel: [282166.033596] CIFS VFS: No writable
handles for inode
Apr 29 19:41:03 kubuntu kernel: [282886.161769] CIFS VFS: No writable
handles for inode
Apr 29 19:43:09 kubuntu kernel: [283012.170066] CIFS VFS: Server
diskstation has not responded in 120 seconds. Reconnecting...
Apr 29 19:45:16 kubuntu kernel: [283138.298055] CIFS VFS: Server
diskstation has not responded in 120 seconds. Reconnecting...
Apr 29 19:47:22 kubuntu kernel: [283264.414066] CIFS VFS: Server
diskstation has not responded in 120 seconds. Reconnecting...

I restarted the samba on the server and now my "ls /" still has trouble:

ls: cannot access '/nasdev': Host is down
ls: cannot access '/nashome': Host is down

There are a lot of smbd instances running on the server and they are all
running under my user.

I cannot kill them and their ps flags are all D: uninterruptible sleep
(usually IO)
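
The "D" flag mentioned above can also be read directly from /proc. What
follows is only a minimal sketch, assuming a Linux /proc layout and a
Python interpreter on the box (neither is confirmed for the DiskStation);
the function names are made up for illustration. It reads only
/proc/<pid>/stat and reports the processes whose state field is "D".

```python
import os


def parse_stat_line(line):
    """Split one /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ")",
    # so the *last* closing parenthesis marks where it ends.
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx])
    comm = line[open_idx + 1:close_idx]
    state = line[close_idx + 1:].split()[0]
    return pid, comm, state


def uninterruptible_processes(proc_root="/proc"):
    """Return sorted (pid, comm) pairs for processes currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "sys"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat_line(handle.read())
        except (OSError, ValueError):
            continue  # the process exited while we were scanning
        if state == "D":
            found.append((pid, comm))
    return sorted(found)
```

Calling uninterruptible_processes() with no arguments scans the live
/proc; the proc_root parameter exists only so the function can be tried
against a copy of the directory.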

I can unmount shares now but remounting fails:

mount error(112): Host is down

Stopping smbd on the server goes:

stop smbd ... no pid file, but process exist
stop winbindd ... not running
stop nmbd ... not running
smbd still running, wait 10 sec, force kill

Starting doesn't work anymore, it lists all the existing processes:

start nmbd ... ok
start smbd ...
smbd is running: 31035 30984 2817 2712 2609 2493 2396 2254 2150 2048
1943 1841 1704 1605 1504 1404 1263 1163 1063 966 864 720 621 524

There are no file handles open on the temp directory at which the
truecrypt is mounted:

# lsof mtest
<nothing>

But cannot umount

umount: can't umount /volume1/media/mtest: Device or resource busy

cannot remove the device mapper thing either of course:

# dmsetup remove truecrypt1
device-mapper: remove ioctl failed: Device or resource busy
Command failed

Basically there are probably still smbd processes preventing that dir
from being unmounted.

I do not have the error anymore that cp gave when I tried to cp
something to the directory over smb.

But it said something with "fwrite".

I probably won't be able to reboot the NAS, I will have to hard reset
it.

NOW I am finding it:

smbd 30984 xen 28uW REG 253,0 47316992 13
/volume1/media/mtest/.xaa.q80ZIO

xaa was a 48MB file I was trying to write to the container over smb.
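
A lookup similar to the lsof one above can be sketched by walking
/proc/<pid>/fd. This is an illustration under assumptions, not what lsof
itself does: it assumes a Linux /proc layout and enough privileges to
read other users' fd directories, and the function name is hypothetical.
It only reads the symlink text of each descriptor, so it never opens or
stats the file on the stuck mount.

```python
import os


def open_files_under(prefix, proc_root="/proc"):
    """Return (pid, fd, target) for descriptors whose target lies under prefix."""
    prefix = prefix.rstrip("/")
    matches = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # no permission, or the process has gone away
        for fd in fds:
            try:
                # readlink returns the link text only; the target file is
                # not opened or stat()ed here.
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue
            if target == prefix or target.startswith(prefix + "/"):
                matches.append((int(entry), int(fd), target))
    return sorted(matches)
```

With the paths from this thread, open_files_under("/volume1/media/mtest")
would be the call to try.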

trying to rm the file blocks the process (shell).

Now I cannot kill rm.

I think Synology has protected against attaching a debugger, so I cannot
(try to) close it that way either.

--
To unsubscribe from this list go to the following URL and read the
instructions: https://lists.samba.org/mailman/options/samba

Xen

2016-07-01 10:18:11 UTC


Post by Xen
Copying a file to my mounted samba share did not work, the files are
created but they remain at 0 size. Then apparently smbd blocks in IO
and never continues, never recovers. I cannot kill the process, I
will have to reboot the NAS but it won't unmount filesystems so I
need a hard reset.

That sounds like a kernel bug. Samba only uses normal
open/pread/pwrite/close calls which should work on top of truecrypt as
far as I know.

Before I had issues on some device using TrueCrypt because the entire
thing would hang (mostly SMBD, then) when I read or actually wrote files
to a truecrypt volume mounted inside of another filesystem, so that smbd
would have to cross a filesystem boundary.

I never tested what would happen if the share itself was in the mounted
crypt, from the beginning.

I now compiled LUKS on the device and I get an even stranger error in a
certain sense.

From the "top level" samba share, the subdirectory that has LUKS
mounted, doesn't show any files.

However, when I create another share directly on the mounted LUKS
subfolder, the files are there.

So the transition from regular filespace to LUKS filespace (a different
volume) is now causing this LUKS filespace to appear empty to Samba, but
not to the linux system itself. Is this to be expected?

Is that normal operation, this?

Regards.

ps. it is not a great problem because the intent is probably going to be
to share an entire crypt container anyway, so the problem wouldn't
surface, I just wonder whether it is normal.


Xen

2016-07-02 15:01:45 UTC


Post by Xen
So the transition from regular filespace to LUKS filespace (a
different volume) is now causing this LUKS filespace to appear
empty to Samba, but not to the linux system itself. Is this to be
expected?
Is that normal operation, this?

Hard to tell without logs I'm afraid. We'd need much more data
to tell, but the honest answer is probably "don't do that" :-).

I have no smbd logs other than output that says that some parameter
(wide symlinks) is unknown. There is nothing else in there.

The previous error (with Truecrypt) was found in some logs though.

Basically umount is trying to kill the process, but it is unkillable:

Apr 30 00:35:50 diskstation umount: Kill the process
"/usr/syno/sbin/smbd" with /store/home/xen.
Apr 30 00:35:50 diskstation umount: Kill the process
"/usr/syno/sbin/smbd" with /store/home/xen.
Apr 30 00:35:50 diskstation umount: Kill the process
"/usr/syno/sbin/smbd" with /store/home/xen.
Apr 30 00:35:50 diskstation umount: Kill the process
"/usr/syno/sbin/smbd" with /store/home/xen.
Apr 30 00:35:50 diskstation umount: Kill the process
"/usr/syno/sbin/smbd" with /store/home/xen.
Apr 30 00:35:50 diskstation umount: Kill the process
"/usr/syno/sbin/smbd" with /store/home/xen.
Apr 30 00:35:50 diskstation umount: Kill the process
"/usr/syno/sbin/smbd" with /store/home/xen.
Apr 30 00:35:50 diskstation umount: Kill the process
"/usr/syno/sbin/smbd" with /store/home/xen.

The unkillable machine ;-).

Anyway,

this makes it troublesome because I want only a small area of my share
to be encrypted ;-).

The thing is both data storage (personal files) and a workspace (for
when I log into the machine). So basically I need "two"
homedirectories.....

This is annoying and complicates matters. I have found a solution but now
my system hangs.

The most annoying thing about samba as a client (and this applies to NFS
as well) is that when there is some network error, the entire system may
hang, as all reads of the root filesystem (or the / directory) can block
until that network mount is resolved, practically rendering your entire
system frozen.

Is there not a solution for that?


Reindl Harald

2016-07-02 15:07:40 UTC


Post by Xen
The most annoying thing about samba as a client (and this applies to NFS
as well) is that when there is some network error, the entire system may
hang, as all reads of the root filesystem (or the / directory) can block
until that network mount is resolved, practically rendering your entire
system frozen.
Is there not a solution for that?

that's hardly something which can be changed in the application layer
and you have similar problems when you write large data to a slow
block-device connected with USB

Xen

2016-07-02 15:24:23 UTC


Post by Xen
The most annoying thing about samba as a client (and this applies to NFS
as well) is that when there is some network error, the entire system may
hang, as all reads of the root filesystem (or the / directory) can block
until that network mount is resolved, practically rendering your entire
system frozen.
Is there not a solution for that?

I've had to hard-reset my machine.

It is like git reset --hard but not very functional.

Sometimes mount will cause a freeze of the / directory with remote
(network) filesystems.

And sometimes you can only recover by hard-resetting the machine.

And then upon reboot, you try the same thing, and no issue at all.

(Thank you Harald)

I mean, maybe it has nothing to do with NFS or SMBD, but....

I run into this regularly and as such, have regular (perhaps not as
often, depends on how much you need to do it) complete system freezes.

Post by Reindl Harald
that's hardly something which can be changed in the application layer
and you have similar problems when you write large data to a slow
block-device connected with USB

So do you have any info on whether someone else could fix it? I mean,
there must be kernel people that know about it right. Is there a way to
file the issue with some bug tracker or something? Somewhere people will
notice?

The CIFS mount hangs for some reason and the entire system goes down
with it.

And every process that needs to read the / directory, hangs.

And that's your Linux stability, then :S.

This can also happen on a server that has an NFS mount.

I suppose the reading of / fails because the stat on the mountpoint
fails (blocks).

So it's probably not only going to be a hanging mount(), but also a
subsequent hanging stat().
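
One way to check a mountpoint without risking the calling shell is to
let a throwaway child process do the stat() and give up on it after a
timeout. The following is a minimal sketch only; the function name and
the five second default are arbitrary choices, not something taken from
CIFS or NFS. It does not fix the hang, it just keeps the caller out of
it.

```python
import subprocess
import sys


def mountpoint_responds(path, timeout=5.0):
    """Return True if stat(path) completes in a child within timeout seconds.

    True means the call returned at all, whether it succeeded or failed
    with an error such as "Host is down".
    """
    child = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        child.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Request a kill but do NOT wait for the child afterwards: if it
        # sits in uninterruptible sleep, waiting would hang this process too.
        child.kill()
        return False
    return True
```

A child that is stuck in "D" state stays around until the kernel lets go
of it, so this is a probe to run occasionally, not in a tight loop.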

Serious issue for me.


Reindl Harald

2016-07-02 15:34:45 UTC


Post by Xen

Post by Reindl Harald
that's hardly something which can be changed in the application layer
and you have similar problems when you write large data to a slow
block-device connected with USB

kernel people are surely aware but it's not all that easy to fix as it
appears to be - example: https://lwn.net/Articles/572911/

Xen

2016-07-04 02:00:50 UTC


Post by Xen

Post by Reindl Harald
that's hardly something which can be changed in the application layer
and you have similar problems when you write large data to a slow
block-device connected with USB

kernel people are surely aware but it's not all that easy to fix as it
appears to be - example: https://lwn.net/Articles/572911/

I will bet my favourite hat and all the money in the world that the
thing would actually be quite easy to fix if people actually implemented
something sane such as per-device (dirty) cache buffers.

It seems there is a system-wide cache that doesn't take account of
anything and is just one big pool.
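
For reference, Linux exposes the global dirty-page thresholds under
/proc/sys/vm and per-backing-device min_ratio/max_ratio values under
/sys/class/bdi. A minimal sketch for reading them follows, assuming
those paths are present on the kernel in question; the function names
are made up for illustration, and how much those knobs help in practice
is a separate question.

```python
import os

# Global writeback thresholds; the *_bytes variants read 0 when the
# corresponding *_ratio variant is the one in effect.
VM_KEYS = (
    "dirty_background_ratio",
    "dirty_ratio",
    "dirty_background_bytes",
    "dirty_bytes",
)


def read_int(path):
    """Read a single integer from a procfs/sysfs style file."""
    with open(path) as handle:
        return int(handle.read().strip())


def global_dirty_limits(vm_dir="/proc/sys/vm"):
    """Return the system-wide dirty-page thresholds as a dict."""
    return {key: read_int(os.path.join(vm_dir, key)) for key in VM_KEYS}


def per_device_ratios(bdi_dir="/sys/class/bdi"):
    """Return {device: (min_ratio, max_ratio)} for each backing device."""
    ratios = {}
    for name in sorted(os.listdir(bdi_dir)):
        try:
            ratios[name] = (
                read_int(os.path.join(bdi_dir, name, "min_ratio")),
                read_int(os.path.join(bdi_dir, name, "max_ratio")),
            )
        except (OSError, ValueError):
            continue  # entry without these files, or unreadable
    return ratios
```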

And it seems the problem is not getting solved because no one can agree
on a solution, not because it would be hard to do so. As several people
in the thread have said: "perfect is the enemy of good".

I mean this is just ridiculous. That is like not being able to come up
with a solution to having a light circuit with 2 switches. Or any kind
of engineering problem that has long since been solved. The appearance
seems to be that people prefer systems totally locking up, over some
design decision that might cause USB disks to see a 10% drop in
performance.

I was running a system off a USB 2 stick. Although the system boots up in
a reasonable time, many IO operations block incessantly for no apparent
reason at all. How is the system capable of booting up, and of being
installed, on that stick without much issue, while saving a file in
LibreOffice (not even on the stick itself) causes the system to hang for
20 seconds?!

You really slam your head in despair. Completely incomprehensible how a
system can operate like that.

But right now the situation is different. It appears to have nothing to
do with IO queues, in that sense.


Xen

2016-07-04 02:17:59 UTC


I mentioned that when I had mounted TrueCrypt server-side on some
subdirectory of a Samba share, my local system (client) would see the
mount point "lock up".

Today it happened again, I don't know why. I had been toying with LUKS
(on the server) but after a reboot (of the server) the thing was not
even mounted and also not mounted client-side.

Please see this paste:

http://pastebin.com/bnKpmXXv

It contains a call trace for "gvfsd-recent" and several "task pool"s. I
know that "lookup_fast" is the function that either gets something from
the dentry cache, or otherwise tries to ask the actual filesystem (or
was this lookup_slow?) -- but apart from that I don't know what it says.

Someone with enough knowledge should be able to see what goes on.

Processes waiting for that IO will enter a "D" state in the "ps"
overview. D means "uninterruptible sleep". The program "Kate" locked up
first, that I noticed. I could no longer start Kate. Later, every access
for the mount point (or the root directory) caused the lockup.

These mount points are in "/nas". Meaning, I have a "/nas/home", a
"/nas/media" and a "/nas/backup". Every other root directory still works
fine (ie., var, home, usr, sbin, proc, sys, etc, ....) but the moment I
try to read "/nas" the process hangs.
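
To see which network mounts exist without touching "/nas" itself, the
kernel's mount table can be parsed as text. This is a hedged sketch: the
set of filesystem types is an assumption, the function names are
hypothetical, and the share names in the usage note are only examples.
Reading /proc/self/mounts lists the table; it does not stat the
mountpoints.

```python
import re

# Filesystem types treated as "network" here; an assumption, extend as needed.
NETWORK_FS_TYPES = {"cifs", "nfs", "nfs4"}


def _unescape(field):
    # The mount table writes space, tab, newline and backslash as octal
    # escapes such as "\040".
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def network_mounts(mounts_text):
    """Return (source, mountpoint, fstype) for each network filesystem entry."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # blank or malformed line
        source, mountpoint, fstype = fields[0], fields[1], fields[2]
        if fstype in NETWORK_FS_TYPES:
            result.append((_unescape(source), _unescape(mountpoint), fstype))
    return result
```

Typical use would be network_mounts(open("/proc/self/mounts").read()),
which on the client described here should list entries such as
"/nas/home" with type cifs.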

It is clear the hang centers around "schedule_preempt_disabled" and
"schedule" after doing "mutex_lock" and "__mutex_lock_slowpath" after
doing a bunch of CIFS stuff.

I don't know enough about the VFS and scheduling to know what this
means, but what is the issue here exactly?

This is not just some IO choke like Reindl Harald mentioned. This is a
deadlock.
